Abstract

Szemerédi's regularity lemma is a fundamental tool in extremal combinatorics. However,
the original version is only helpful in studying dense graphs. In the 1990s, Kohayakawa and
Rödl proved an analogue of Szemerédi's regularity lemma for sparse graphs as part of a general
program toward extending extremal results to sparse graphs. Many of the key applications of
Szemerédi's regularity lemma use an associated counting lemma. In order to prove extensions
of these results which also apply to sparse graphs, it remained a well-known open problem to
prove a counting lemma in sparse graphs.

The main advance of this paper lies in a new counting lemma, proved following the functional
approach of Gowers, which complements the sparse regularity lemma of Kohayakawa and Rödl,
allowing us to count small graphs in regular subgraphs of a sufficiently pseudorandom graph.
We use this to prove sparse extensions of several well-known combinatorial theorems, including
the removal lemmas for graphs and groups, the Erdős-Stone-Simonovits theorem and Ramsey's
theorem. These results extend and improve upon a substantial body of previous work.
1 Introduction

Szemerédi's regularity lemma is one of the most powerful tools in extremal combinatorics. Roughly
speaking, it says that the vertex set of every graph can be partitioned into a bounded number
of parts so that the induced bipartite graph between almost all pairs of parts is pseudorandom.
Many important results in graph theory, such as the graph removal lemma and the
Erdős-Stone-Simonovits theorem on Turán numbers, have straightforward proofs using the regularity lemma.

Crucial to most applications of the regularity lemma is the use of a counting lemma. A counting 
lemma, roughly speaking, is a result that says that the number of embeddings of a fixed graph H 
into a pseudorandom graph G can be estimated by pretending that G were a genuine random 
graph. The combined application of the regularity lemma and a counting lemma is known as 
the regularity method, and has important applications in graph theory, combinatorial geometry, 
additive combinatorics and theoretical computer science. For surveys on the regularity method and 
its applications, see [57, 62, 76].

One of the limitations of Szemerédi's regularity lemma is that it is only meaningful for dense
graphs. While an analogue of the regularity lemma for sparse graphs has been proven by
Kohayakawa [54] and by Rödl (see also [44, 80]), the problem of proving an associated counting
lemma for sparse graphs has turned out to be much more difficult. In random graphs, proving such
an embedding lemma is a famous problem, known as the KLR conjecture [55], which has only been
resolved very recently [10, 26].

Establishing an analogous result for pseudorandom graphs has been a central problem in this 
area. Certain partial results are known in this case [59, 61], but it has remained an open problem to
prove a counting lemma for embedding a general fixed subgraph. We resolve this difficulty, proving 
a counting lemma for embedding any fixed small graph into subgraphs of sparse pseudorandom 
graphs. 

As applications, we prove sparse extensions of several well-known combinatorial theorems,
including the removal lemmas for graphs and groups, the Erdős-Stone-Simonovits theorem, and
Ramsey's theorem. Proving such sparse analogues for classical combinatorial results has been an
important trend in modern combinatorics research. For example, a sparse analogue of Szemerédi's
theorem was an integral part of Green and Tao's proof [52] that the primes contain arbitrarily long
arithmetic progressions.

1.1 Pseudorandom graphs 

The binomial random graph $G_{n,p}$ is formed by taking an empty graph on $n$ vertices and choosing to
add each edge independently with probability $p$. These graphs tend to be very well-behaved. For
example, it is not hard to show that with high probability all large vertex subsets $X, Y$ have density
approximately $p$ between them. Motivated by the question of determining when a graph behaves in
a random-like manner, Thomason [88, 89] began the first systematic study of this property. Using
a slight variant of Thomason's notion, we say that a graph on vertex set $V$ is $(p, \beta)$-jumbled if, for
all vertex subsets $X, Y \subseteq V$,

\[ |e(X,Y) - p|X||Y|| \le \beta\sqrt{|X||Y|}. \]

The random graph $G_{n,p}$ is, with high probability, $(p, \beta)$-jumbled with $\beta = O(\sqrt{pn})$. It is not hard
to show [33, 35] that this is optimal and that a graph on $n$ vertices with $p \le 1/2$ cannot be
$(p, \beta)$-jumbled with $\beta = o(\sqrt{pn})$. Nevertheless, there are many explicit examples which are optimally
jumbled in that $\beta = O(\sqrt{pn})$. The Paley graph with vertex set $\mathbb{Z}_p$, where $p \equiv 1 \pmod 4$ is prime,
and edge set given by connecting $x$ and $y$ if their difference is a quadratic residue is such a graph
with $p = \frac{1}{2}$ and $\beta = O(\sqrt{n})$. Many more examples are given in the excellent survey [64].
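As an illustration of the jumbledness definition (not part of the original argument), the following Python sketch computes the smallest admissible $\beta$ exactly for a small Paley graph. The helper names `paley_neighbors` and `jumbledness` are hypothetical and introduced only here. The sketch assumes that $e(X,Y)$ counts pairs $(x,y) \in X \times Y$ that are edges. It relies on the observation that, for a fixed $X$ and a fixed size of $Y$, the deviation is extremal when $Y$ consists of the vertices with the largest (or smallest) values of $\deg_X(y) - p|X|$, so only the $2^n$ choices of $X$ need to be enumerated.

```python
from math import sqrt

def paley_neighbors(q):
    """Neighbor bitmasks of the Paley graph on Z_q (q prime, q = 1 mod 4)."""
    residues = {(x * x) % q for x in range(1, q)}
    return [sum(1 << y for y in range(q) if (x - y) % q in residues)
            for x in range(q)]

def jumbledness(nbr, p):
    """Smallest beta with |e(X,Y) - p|X||Y|| <= beta*sqrt(|X||Y|) for all nonempty X, Y."""
    n = len(nbr)
    best = 0.0
    for X in range(1, 1 << n):
        size_x = bin(X).count("1")
        # w[y] = (number of neighbors of y inside X) - p|X|, sorted increasingly
        w = sorted(bin(nbr[y] & X).count("1") - p * size_x for y in range(n))
        low = high = 0.0
        for k in range(1, n + 1):
            low += w[k - 1]    # sum of the k smallest deviations
            high += w[n - k]   # sum of the k largest deviations
            # the extreme value of |e(X,Y) - p|X||Y|| over all Y with |Y| = k
            best = max(best, max(-low, high) / sqrt(size_x * k))
    return best

beta = jumbledness(paley_neighbors(13), 0.5)
print(beta, sqrt(13))
```

The enumeration is exponential in the number of vertices, so this is only meant for very small graphs such as the 13-vertex example used here.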

A fundamental result of Chung, Graham and Wilson [19] states that for graphs of density p, 
where p is a fixed positive constant, the property of being (p, o(n))-jumbled is equivalent to a 
number of other properties that one would typically expect in a random graph. The following 
theorem details some of these many equivalences. 

Theorem. For any fixed $0 < p < 1$ and any sequence of graphs $(\Gamma_n)_{n \in \mathbb{N}}$ with $|V(\Gamma_n)| = n$ the
following properties are equivalent.

$P_1$: $\Gamma_n$ is $(p, o(n))$-jumbled, that is, for all subsets $X, Y \subseteq V(\Gamma_n)$, $e(X,Y) = p|X||Y| + o(n^2)$;

$P_2$: $e(\Gamma_n) \ge p\binom{n}{2} + o(n^2)$, $\lambda_1(\Gamma_n) = pn + o(n)$ and $|\lambda_2(\Gamma_n)| = o(n)$, where $\lambda_i(\Gamma_n)$ is the $i$th largest
eigenvalue, in absolute value, of the adjacency matrix of $\Gamma_n$;

$P_3$: for all graphs $H$, the number of labeled induced copies of $H$ in $\Gamma_n$ is
$p^{e(H)}(1-p)^{\binom{\ell}{2}-e(H)}n^{\ell} + o(n^{\ell})$, where $\ell = |V(H)|$;

$P_4$: $e(\Gamma_n) \ge p\binom{n}{2} + o(n^2)$ and the number of labeled cycles of length 4 in $\Gamma_n$ is at most $p^4n^4 + o(n^4)$.
Any graph sequence which satisfies any (and hence all) of these properties is said to be
$p$-quasirandom. The most surprising aspect of this theorem, already hinted at in Thomason's work,
is that if the number of cycles of length 4 is as one would expect in a binomial random graph then
this is enough to imply that the edges are very well-spread. This theorem has been quite influential.
It has led to the study of quasirandomness in other structures such as hypergraphs [Ml [U] , groups
[49], tournaments, permutations and sequences (see [TS] and its references), and progress on problems
in different areas (see, e.g., [22l dSl |l9] ). It is also closely related to Szemerédi's regularity lemma
and its recent hypergraph generalization [48l [Ml (771 187], and all proofs of Szemerédi's theorem [85]
on long arithmetic progressions in dense subsets of the integers use some notion of quasirandomness.
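To make the 4-cycle condition concrete, here is a small Python sketch (an illustration only, with hypothetical helper names `paley_adjacency` and `labeled_four_cycles`) that counts labeled cycles of length 4 in a Paley graph and compares the count with $p^4n^4$ for $p = 1/2$. It assumes a labeled 4-cycle means an ordered 4-tuple of distinct vertices $(a, b, c, d)$ with $ab$, $bc$, $cd$, $da$ all edges. Since the theorem is asymptotic, the ratio for a fixed small graph is only indicative.

```python
def paley_adjacency(q):
    """Adjacency matrix (lists of booleans) of the Paley graph on Z_q, q prime, q = 1 mod 4."""
    residues = {(x * x) % q for x in range(1, q)}
    return [[(x - y) % q in residues for y in range(q)] for x in range(q)]

def labeled_four_cycles(adj):
    """Count ordered 4-tuples of distinct vertices (a, b, c, d) forming a cycle a-b-c-d-a."""
    n = len(adj)
    total = 0
    for a in range(n):
        for c in range(n):
            if a == c:
                continue
            # b and d must be two distinct common neighbors of a and c
            common = sum(1 for b in range(n) if adj[a][b] and adj[c][b])
            total += common * (common - 1)
    return total

q = 101
count = labeled_four_cycles(paley_adjacency(q))
expected = (0.5 ** 4) * q ** 4
print(count, expected, count / expected)
```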

For sparser graphs, the equivalences between the natural generalizations of these properties 
are not so clear cut (see [561 IH] for discussions). In this case, it is natural to generalize 
the jumbledness condition for dense graphs by considering graphs which are (p, o(pn))-jumbled. 
Otherwise, we would not even have control over the density in the whole set. However, it is no 
longer the case that being (p, o(pn))-jumbled implies that the number of copies of any subgraph 
$H$ agrees approximately with the expected count. For $H = K_{3,3}$ and $p = n^{-1/3}$, it is easy to see
this by taking the random graph $G_{n,p}$ and changing three vertices $u$, $v$ and $w$ so that they are each
connected to everything else. This does not affect the property of being $(p, o(pn))$-jumbled but it
does affect the $K_{3,3}$ count, since as well as the roughly $p^9n^6 = n^3$ copies of $K_{3,3}$ that one expects
in a random graph, one gets a further $\Omega(n^3)$ copies of $K_{3,3}$ containing all of $u$, $v$ and $w$.

However, for any given graph $H$ one can find a function $\beta_H := \beta_H(p, n)$ such that if $\Gamma$ is a
$(p, \beta_H)$-jumbled graph then $\Gamma$ contains a copy of $H$. Our chief concern in this paper will be to
determine jumbledness conditions which are sufficient to imply other properties. In particular, we 
will be concerned with determining conditions under which certain combinatorial theorems continue 
to hold within jumbled graphs. 

One particularly well-known class of $(p, \beta)$-jumbled graphs is the collection of $(n, d, \lambda)$-graphs.
These are graphs on $n$ vertices which are $d$-regular and such that all eigenvalues of the adjacency
matrix, save the largest, are smaller in absolute value than $\lambda$. The famous expander mixing lemma
tells us that these graphs are $(p, \beta)$-jumbled with $p = d/n$ and $\beta = \lambda$. Bilu and Linial [TT] proved
a converse of this fact, showing that every $(p, \beta)$-jumbled $d$-regular graph is an $(n, d, \lambda)$-graph with
$\lambda = O(\beta \log(d/\beta))$. This shows that the jumbledness parameter $\beta$ and the second largest in absolute
value eigenvalue $\lambda$ of a regular graph are within a logarithmic factor of each other.

Pseudorandom graphs have many surprising properties and applications and have recently
attracted a lot of attention both in combinatorics and theoretical computer science (see, e.g., [64]).
Here we will focus on their behavior with respect to extremal properties. We discuss these properties 
in the next section. 

1.2 Extremal results in pseudorandom graphs 

In this paper, we study the extent to which several well-known combinatorial statements continue 
to hold relative to pseudorandom graphs or, rather, $(p, \beta)$-jumbled graphs and $(n, d, \lambda)$-graphs.

One of the most important applications of the regularity method is the graph removal lemma
[H [78]. In the following statement and throughout the paper, $v(H)$ and $e(H)$ will denote the
number of vertices and edges in the graph $H$, respectively. The graph removal lemma states that
for every fixed graph $H$ and every $\epsilon > 0$ there exists $\delta > 0$ such that if $G$ contains at most $\delta n^{v(H)}$
copies of $H$ then $G$ may be made $H$-free by removing at most $\epsilon n^2$ edges. This innocent looking
result, which follows easily from Szemerédi's regularity lemma and the graph counting lemma, has
surprising applications in diverse areas, amongst others a simple proof of Roth's theorem on 3-term
arithmetic progressions in dense subsets of the integers. It is also much more difficult to prove than
one might expect, the best known bound [37] on $\delta^{-1}$ being a tower function of height on the order
of $\log \epsilon^{-1}$.

An analogue of this result for random graphs (and hypergraphs) was proven in [25]. For
pseudorandom graphs, the following analogue of the triangle removal lemma was recently proven by
Kohayakawa, Rödl, Schacht and Skokan [59].

Theorem. For every $\epsilon > 0$, there exist $\delta > 0$ and $c > 0$ such that if $\beta \le cp^3n$ then any
$(p, \beta)$-jumbled graph $\Gamma$ on $n$ vertices has the following property. Any subgraph of $\Gamma$ containing at most
$\delta p^3n^3$ triangles may be made triangle-free by removing at most $\epsilon pn^2$ edges.

Here we extend this result to all $H$. The degeneracy $d(H)$ of a graph $H$ is the smallest
nonnegative integer $d$ for which there exists an ordering of the vertices of $H$ such that each vertex
has at most $d$ neighbors which appear earlier in the ordering. Equivalently, it may be defined as
$d(H) = \max\{\delta(H') : H' \subseteq H\}$, where $\delta(H)$ is the minimum degree of $H$. Throughout the paper,
we will also use the standard notation $\Delta(H)$ for the maximum degree of $H$.
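The two equivalent definitions of degeneracy suggest a simple procedure: repeatedly delete a vertex of minimum degree and record the largest degree seen at deletion time. The following Python sketch implements this standard peeling procedure. The function names `graph` and `degeneracy` are chosen here for illustration and do not come from the paper.

```python
def graph(edges):
    """Build an adjacency dictionary {vertex: set of neighbors} from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degeneracy(adj):
    """Degeneracy d(H): the maximum, over the peeling process, of the minimum degree."""
    remaining = {v: set(nb) for v, nb in adj.items()}
    best = 0
    while remaining:
        # remove a vertex of minimum degree in the current subgraph
        v = min(remaining, key=lambda u: len(remaining[u]))
        best = max(best, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
    return best

k4 = graph([(a, b) for a in range(4) for b in range(a + 1, 4)])
c5 = graph([(i, (i + 1) % 5) for i in range(5)])
p4 = graph([(0, 1), (1, 2), (2, 3)])
print(degeneracy(k4), degeneracy(c5), degeneracy(p4))
```

Reversing the deletion order gives an ordering of the vertices in which each vertex has at most $d(H)$ earlier neighbors.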

The parameter we will use in our theorems, which we refer to as the 2-degeneracy $d_2(H)$, is
related to both of these natural parameters. Given an ordering $v_1, \ldots, v_m$ of the vertices of $H$ and
$i \le j$, let $N_{i-1}(j)$ be the number of neighbors $v_h$ of $v_j$ with $h \le i - 1$. We then define $d_2(H)$ to be
the minimum $d$ for which there is an ordering of the vertices as $v_1, \ldots, v_m$ such that for any edge $v_iv_j$
with $i < j$ the sum $N_{i-1}(i) + N_{i-1}(j) \le 2d$. Note that $d_2(H)$ may be a half-integer. For comparison
with degeneracy, note that $\frac{d(H)-1}{2} \le d_2(H) \le d(H) - \frac{1}{2}$ and both sides can be sharp.
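For very small graphs the 2-degeneracy can be computed directly from the definition by trying every vertex ordering. The Python sketch below does this by brute force. The function name `two_degeneracy` is hypothetical, and the sketch assumes the reading of the definition in which $N_{i-1}(j)$ counts neighbors of $v_j$ placed strictly before $v_i$. The result is returned as a `Fraction` because the value may be a half-integer.

```python
from fractions import Fraction
from itertools import permutations

def two_degeneracy(vertices, edges):
    """Brute-force d_2(H): minimize, over all vertex orderings, the largest value of
    N_{i-1}(i) + N_{i-1}(j) over edges v_i v_j with i < j, then divide by 2."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best = None
    for order in permutations(vertices):
        pos = {v: i for i, v in enumerate(order)}
        worst = 0
        for u, v in edges:
            first = min(pos[u], pos[v])  # position of the earlier endpoint v_i
            # neighbors of either endpoint that are placed strictly before v_i
            total = sum(1 for x in (u, v) for w in adj[x] if pos[w] < first)
            worst = max(worst, total)
        if best is None or worst < best:
            best = worst
    return Fraction(best, 2)

k4_edges = [(a, b) for a in range(4) for b in range(a + 1, 4)]
print(two_degeneracy(range(4), k4_edges))                   # complete graph K_4
print(two_degeneracy(range(4), [(0, 1), (1, 2), (2, 3)]))   # path on 4 vertices
```

The running time grows factorially with the number of vertices, so this is only a way to check small cases by hand.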

Theorem 1.1. For every graph $H$ and every $\epsilon > 0$, there exist $\delta > 0$ and $c > 0$ such that if
$\beta \le cp^{d_2(H)+3}n$ then any $(p, \beta)$-jumbled graph $\Gamma$ on $n$ vertices has the following property. Any
subgraph of $\Gamma$ containing at most $\delta p^{e(H)}n^{v(H)}$ copies of $H$ may be made $H$-free by removing at most
$\epsilon pn^2$ edges.

We remark that for many graphs H , the constant 3 in the exponent of this theorem may be 
improved, and this applies equally to all of the theorems stated below. While we will not dwell on 
this comment, we will call attention to it on occasion throughout the paper, pointing out where 
the above result may be improved. Note that the above theorem generalizes the graph removal 
lemma by considering the case $\Gamma = K_n$, which is $(p, \beta)$-jumbled with $p = 1$ and $\beta = 1$. For the
same reason, the other results we establish extend the original versions. 

Green [51] developed an arithmetic regularity lemma and used it to deduce an arithmetic removal
lemma in abelian groups which extends Roth's theorem. Green's proof of the arithmetic regularity
lemma relies on Fourier analytic techniques. Král, Serra and Vena [63] found a new proof of the
arithmetic removal lemma using the removal lemma for directed cycles which extends to all groups.
They proved that for each $\epsilon > 0$ and integer $m \ge 3$ there is $\delta > 0$ such that if $G$ is a group of order
$n$ and $A_1, \ldots, A_m$ are subsets of $G$ such that there are at most $\delta n^{m-1}$ solutions to the equation
$x_1x_2 \cdots x_m = 1$ with $x_i \in A_i$ for all $i$, then it is possible to remove at most $\epsilon n$ elements from each
set $A_i$ so as to obtain sets $A_i'$ for which there are no solutions to $x_1x_2 \cdots x_m = 1$ with $x_i \in A_i'$ for
all $i$.

By improving the bound in Theorem 1.1 for cycles, we obtain the following sparse extension
of the removal lemma for groups. The Cayley graph $G(S)$ of a subset $S$ of a group $G$ has vertex
set $G$ and $(x, y)$ is an edge of $G(S)$ if $x^{-1}y \in S$. We say that a subset $S$ of a group $G$ is
$(p, \beta)$-jumbled if the Cayley graph $G(S)$ is $(p, \beta)$-jumbled. When $G$ is abelian, if $|\sum_{x \in S} \chi(x)| \le \beta$ for
all nontrivial characters $\chi \colon G \to \mathbb{C}$, then $S$ is $(\frac{|S|}{n}, \beta)$-jumbled (see [59, Lemma 16]). Let $k_3 = 3$,
$k_4 = 2$, $k_m = 1 + \frac{1}{m-3}$ if $m \ge 5$ is odd, and $k_m = 1 + \frac{1}{m-4}$ if $m \ge 6$ is even. Note that $k_m$ tends to
1 as $m \to \infty$.
Theorem 1.2. For each $\epsilon > 0$ and integer $m \ge 3$, there are $c, \delta > 0$ such that the following holds.
Suppose $B_1, \ldots, B_m$ are subsets of a group $G$ of order $n$ such that each $B_i$ is $(p, \beta)$-jumbled with
$\beta \le cp^{k_m}n$. If subsets $A_i \subseteq B_i$ for $i = 1, \ldots, m$ are such that there are at most $\delta|B_1| \cdots |B_m|/n$
solutions to the equation $x_1x_2 \cdots x_m = 1$ with $x_i \in A_i$ for all $i$, then it is possible to remove at
most $\epsilon|B_i|$ elements from each set $A_i$ so as to obtain sets $A_i'$ for which there are no solutions to
$x_1x_2 \cdots x_m = 1$ with $x_i \in A_i'$ for all $i$.

This result easily implies a Roth-type theorem in quite sparse pseudorandom subsets of a
group. We say that a subset $B$ of a group $G$ is $(\epsilon, m)$-Roth if for all integers $a_1, \ldots, a_m$ which
satisfy $a_1 + \cdots + a_m = 0$ and $\gcd(a_i, |G|) = 1$ for $1 \le i \le m$, every subset $A \subseteq B$ which has no
nontrivial solution to $x_1^{a_1}x_2^{a_2} \cdots x_m^{a_m} = 1$ has $|A| \le \epsilon|B|$.

Corollary 1.3. For each $\epsilon > 0$ and integer $m \ge 3$, there is $c > 0$ such that the following holds. If $G$
is a group of order $n$ and $B$ is a $(p, \beta)$-jumbled subset of $G$ with $\beta \le cp^{k_m}n$, then $B$ is $(\epsilon, m)$-Roth.

Note that Roth's theorem on 3-term arithmetic progressions in dense sets of integers follows
from the special case of this result with $B = G = \mathbb{Z}_n$, $m = 3$, and $a_1 = a_2 = 1$, $a_3 = -2$. The rather
weak pseudorandomness condition in Corollary 1.3 shows that even quite sparse pseudorandom
subsets of a group have the Roth property. For example, if $B$ is optimally jumbled, in that
$\beta = O(\sqrt{pn})$ and $p \ge Cn^{-\frac{1}{2k_m-1}}$, then $B$ is $(\epsilon, m)$-Roth. This somewhat resembles a key part of the
proof of the Green-Tao theorem that the primes contain arbitrarily long arithmetic progressions,
where they show that pseudorandom sets of integers have the Szemerédi property. As Corollary 1.3
applies to quite sparse pseudorandom subsets, it may lead to new applications in number theory.
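The density threshold for optimally jumbled sets can be checked by a short computation. The following is a sketch under the assumption that $B$ is $(p, \beta)$-jumbled with $\beta \le C_0\sqrt{pn}$, where $C_0$ is a name introduced here for the implied constant, and writing $k = k_m$ (so that $k \ge 1$). The hypothesis $\beta \le cp^{k}n$ of Corollary 1.3 is then guaranteed as soon as

```latex
\[
C_0\sqrt{pn} \le c\,p^{k}n
\iff p^{k-\frac{1}{2}} \ge \frac{C_0}{c}\,n^{-\frac{1}{2}}
\iff p \ge \Bigl(\frac{C_0}{c}\Bigr)^{\frac{2}{2k-1}} n^{-\frac{1}{2k-1}} .
\]
```

For instance, with $m = 3$ and $k_3 = 3$ this asks for $p$ to be at least a constant multiple of $n^{-1/5}$.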

Our methods are quite general and also imply similar results for other well-known combinatorial
theorems. We say that a graph $\Gamma$ is $(H, \epsilon)$-Turán if any subgraph of $\Gamma$ with at least

\[ \left(1 - \frac{1}{\chi(H)-1} + \epsilon\right) e(\Gamma) \]

edges contains a copy of $H$. Turán's theorem itself [91], or rather a generalization known as the
Erdős-Stone-Simonovits theorem [30], says that $K_n$ is $(H, \epsilon)$-Turán provided that $n$ is large enough.

To find other graphs which are $(H, \epsilon)$-Turán, it is natural to try the random graph $G_{n,p}$. A
recent result of Conlon and Gowers [25], also proved independently by Schacht [79], states that
for every $t \ge 3$ and $\epsilon > 0$ there exists a constant $C$ such that if $p \ge Cn^{-2/(t+1)}$ the graph $G_{n,p}$
is with high probability $(K_t, \epsilon)$-Turán. This confirms a conjecture of Haxell, Kohayakawa, Łuczak
and Rödl [53, 55] and, up to the constant $C$, is best possible. Similar results also hold for more
general graphs $H$ and hypergraphs.

For pseudorandom graphs and, in particular, $(n, d, \lambda)$-graphs, Sudakov, Szabó and Vu [83]
showed the following. A similar result, but in a slightly more general context, was proved by
Chung [15].

Theorem. For every $\epsilon > 0$ and every positive integer $t \ge 3$, there exists $c > 0$ such that if
$\lambda \le cd^{t-1}/n^{t-2}$ then any $(n, d, \lambda)$-graph is $(K_t, \epsilon)$-Turán.

For $t = 3$, an example of Alon [2] shows that this is best possible. His example gives something
even stronger, a triangle-free $(n, d, \lambda)$-graph for which $\lambda \le c\sqrt{d}$ and $d \ge n^{2/3}$. Therefore, no
combinatorial statement about the existence of triangles as subgraphs can surpass the threshold
$\lambda \le cd^2/n$. It has also been conjectured [38, 64, 84] that $\lambda \le cd^{t-1}/n^{t-2}$ is a natural boundary for
finding $K_t$ as a subgraph in a pseudorandom graph, but no examples of such graphs exist for $t \ge 4$.
Finding such graphs remains an important open problem on pseudorandom graphs.

For triangle-free graphs $H$, Kohayakawa, Rödl, Schacht, Sissokho and Skokan [58] proved the
following result which gives a jumbledness condition that implies that a graph is $(H, \epsilon)$-Turán.

Theorem. For any fixed triangle-free graph $H$ and any $\epsilon > 0$, there exists $c > 0$ such that if
$\beta \le cp^{\nu(H)}n$ then any $(p, \beta)$-jumbled graph on $n$ vertices is $(H, \epsilon)$-Turán. Here
$\nu(H) = \frac{1}{2}(d(H) + D(H) + 1)$, where $D(H) = \min\{2d(H), \Delta(H)\}$.

More recently, the case where $H$ is an odd cycle was studied by Aigner-Horev, Hàn and Schacht
[1], who proved the following result, optimal up to the logarithmic factors [6].

Theorem. For every odd integer $\ell \ge 3$ and any $\epsilon > 0$, there exists $c > 0$ such that if
$\beta \log^{\ell-3} n \le cp^{1+1/(\ell-2)}n$ then any $(p, \beta)$-jumbled graph on $n$ vertices is $(C_\ell, \epsilon)$-Turán.

In this paper, we prove that a similar result holds, but for general graphs $H$ and, in most cases,
with a better bound on $\beta$.

Theorem 1.4. For every graph $H$ and every $\epsilon > 0$, there exists $c > 0$ such that if $\beta \le cp^{d_2(H)+3}n$
then any $(p, \beta)$-jumbled graph on $n$ vertices is $(H, \epsilon)$-Turán.

We may also prove a structural version of this theorem, known as a stability result. In the
dense case, this result, due to Erdős and Simonovits [82], states that if an $H$-free graph contains
almost $\left(1 - \frac{1}{\chi(H)-1}\right)\binom{n}{2}$ edges, then it must be very close to being $(\chi(H) - 1)$-partite.

Theorem 1.5. For every graph $H$ and every $\epsilon > 0$, there exist $\delta > 0$ and $c > 0$ such that if
$\beta \le cp^{d_2(H)+3}n$ then any $(p, \beta)$-jumbled graph $\Gamma$ on $n$ vertices has the following property. Any
$H$-free subgraph of $\Gamma$ with at least $\left(1 - \frac{1}{\chi(H)-1} - \delta\right)p\binom{n}{2}$ edges may be made $(\chi(H) - 1)$-partite by
removing at most $\epsilon pn^2$ edges.

The final main result that we will prove concerns Ramsey's theorem [71]. This states that for
any graph $H$ and positive integer $r$, if $n$ is sufficiently large, any $r$-coloring of the edges of $K_n$
contains a monochromatic copy of $H$.

To consider the analogue of this result in sparse graphs, let us say that a graph $\Gamma$ is
$(H, r)$-Ramsey if, in any $r$-coloring of the edges of $\Gamma$, there is guaranteed to be a monochromatic copy
of $H$. For $G_{n,p}$, a result of Rödl and Ruciński [75] determines the threshold down to which the
random graph is $(H, r)$-Ramsey with high probability. For most graphs, including the complete
graph $K_t$, this threshold is the same as for the Turán property. These results were only extended to
hypergraphs comparatively recently, by Conlon and Gowers [25] and by Friedgut, Rödl and Schacht

m- 

Very little seems to be known about the $(H, r)$-Ramsey property relative to pseudorandom
graphs. In the triangle case, results of this kind are implicit in some recent papers [28, 67] on
Folkman numbers, but no general theorem seems to be known. We prove the following.

Theorem 1.6. For every graph $H$ and every positive integer $r \ge 2$, there exists $c > 0$ such that if
$\beta \le cp^{d_2(H)+3}n$ then any $(p, \beta)$-jumbled graph on $n$ vertices is $(H, r)$-Ramsey.

One common element to all these results is the requirement that $\beta \le cp^{d_2(H)+C}n$. It is not hard
to see that this condition is almost sharp. Consider the binomial random graph on $n$ vertices where
each edge is chosen with probability $p = cn^{-2/d(H)}$, where $c < 1$. By the definition of degeneracy,
there exists some subgraph $H'$ of $H$ such that $d(H)$ is the minimum degree of $H'$. Therefore,
$e(H') \ge v(H')d(H)/2$ and the expected number of copies of $H'$ is at most

\[ p^{e(H')}n^{v(H')} \le \left(p^{d(H)/2}n\right)^{v(H')} < 1. \]

We conclude that with positive probability $G_{n,p}$ does not contain a copy of $H'$ or, consequently, of
$H$. On the other hand, with high probability, it is $(p, \beta)$-jumbled with

\[ \beta = O(\sqrt{pn}) = O\left(p^{(d(H)+2)/4}n\right). \]

Since $d_2(H)$ differs from $d(H)$ by at most a constant factor, we therefore see that, up to a
multiplicative constant in the exponent of $p$, our results are best possible.

If $H = K_t$, it is sufficient, for the various combinatorial theorems above to hold, that the graph
$\Gamma$ be $(p, cp^tn)$-jumbled. For triangles, the example of Alon shows that there are $(p, cp^2n)$-jumbled
graphs which do not contain any triangles and, for $t \ge 4$, it is conjectured [38, 64, 84] that there
are $(p, cp^{t-1}n)$-jumbled graphs which do not contain a copy of $K_t$. If true, this would imply that
in the case of cliques all of our results are sharp up to an additive constant of one in the exponent.
A further discussion relating to the optimal exponent of p for general graphs is in the concluding 
remarks. 

1.3 Regularity and counting lemmas 

One of the key tools in extremal graph theory is Szemerédi's regularity lemma [86]. Roughly
speaking, this says that any graph may be partitioned into a collection of vertex subsets so that 
the bipartite graph between most pairs of vertex subsets is random-like. To be more precise, we 
introduce some notation. It will be to our advantage to be quite general from the outset. 

A weighted graph on a set of vertices $V$ is a symmetric function $G \colon V \times V \to [0,1]$. Here
symmetric means that $G(x,y) = G(y,x)$. A weighted graph is bipartite (or multipartite) if it is
supported on the edge set of a bipartite (or multipartite) graph. A graph can be viewed as a
weighted graph by taking G to be the characteristic function of the edges. 

Note that here and throughout the remainder of the paper, we will use integral notation for 
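summing over vertices in a graph. For example, if $G$ is a bipartite graph with vertex sets $X$ and
$Y$, and $f$ is any function $X \times Y \to \mathbb{R}$, then we write

\[ \int_{x \in X,\, y \in Y} f(x,y)\,dxdy := \frac{1}{|X||Y|}\sum_{x \in X}\sum_{y \in Y} f(x,y). \]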

The measure dx will always denote the uniform probability distribution on X. The advantage of 
the integral notation is that we do not need to keep track of the number of vertices in G. All our 
formulas are, in some sense, scale-free with respect to the order of G. Consequently, our results 
also have natural extensions to graph limits [66], although we do not explore this direction here.

Definition 1.7 (DISC). A weighted bipartite graph $G \colon X \times Y \to [0,1]$ is said to satisfy the
discrepancy condition DISC$(q, \epsilon)$ if

\[ \left| \int_{x \in X,\, y \in Y} (G(x,y) - q)u(x)v(y)\,dxdy \right| \le \epsilon \tag{1} \]

for all functions $u \colon X \to [0,1]$ and $v \colon Y \to [0,1]$. In any weighted graph $G$, if $X$ and $Y$ are subsets
of vertices of $G$, we say that the pair $(X, Y)_G$ satisfies DISC$(q, \epsilon)$ if the induced weighted graph on
$X \times Y$ satisfies DISC$(q, \epsilon)$.

The usual definition for discrepancy of an (unweighted) bipartite graph $G$ is that for all $X' \subseteq X$,
$Y' \subseteq Y$, we have $|e(X',Y') - q|X'||Y'|| \le \epsilon|X||Y|$. It is not hard to see that the two notions of
discrepancy are equivalent (with the same $\epsilon$).





A partition $V(G) = V_1 \cup \cdots \cup V_k$ is said to be equitable if all pieces in the partition are of
comparable size, that is, if $||V_i| - |V_j|| \le 1$ for all $i$ and $j$. Szemerédi's regularity lemma now says
the following.

Theorem (Szemerédi's regularity lemma). For every $\epsilon > 0$ and every positive integer $m_0$, there
exists a positive integer $M$ such that any weighted graph $G$ has an equitable partition into $k$ pieces
with $m_0 \le k \le M$ such that all but at most $\epsilon k^2$ pairs of vertex subsets $(V_i, V_j)$ satisfy DISC$(q_{ij}, \epsilon)$
for some $q_{ij}$.

On its own, the regularity lemma would be an interesting result. But what really makes it so 
powerful is the fact that the discrepancy condition allows us to count small subgraphs. In particular, 
we have the following result, known as a counting lemma. 

Proposition 1.8 (Counting lemma in dense graphs). Let $G$ be a weighted $m$-partite graph with
vertex subsets $X_1, X_2, \ldots, X_m$. Let $H$ be a graph with vertex set $\{1, \ldots, m\}$ and with $e(H)$ edges.
For each edge $(i,j)$ in $H$, assume that the induced bipartite graph $G(X_i, X_j)$ satisfies DISC$(q_{ij}, \epsilon)$.
Define

\[ G(H) := \int_{x_1 \in X_1, \ldots, x_m \in X_m} \prod_{(i,j) \in E(H)} G(x_i, x_j)\,dx_1 \cdots dx_m \]

and

\[ q(H) := \prod_{(i,j) \in E(H)} q_{ij}. \]

Then

\[ |G(H) - q(H)| \le e(H)\epsilon. \]

The above result, for an unweighted graph, is usually stated in the following equivalent way:
the number of embeddings of $H$ into $G$, where the vertex $i \in V(H)$ lands in $X_i$, differs from
$|X_1||X_2| \cdots |X_m| \prod_{(i,j) \in E(H)} q_{ij}$ by at most $e(H)\epsilon|X_1||X_2| \cdots |X_m|$. Our notation $G(H)$ can be
viewed as the probability that a random embedding of vertices of $H$ into their corresponding parts
in $G$ gives rise to a valid embedding as a subgraph.

Proposition 1.8 may be proven by telescoping (see, e.g., Theorem 2.7 in [12]). Consider, for
example, the case where $H$ is a triangle. Then $G(x_1,x_2)G(x_1,x_3)G(x_2,x_3) - q_{12}q_{13}q_{23}$ may be
rewritten as

\[ (G(x_1,x_2) - q_{12})G(x_1,x_3)G(x_2,x_3) + q_{12}(G(x_1,x_3) - q_{13})G(x_2,x_3) + q_{12}q_{13}(G(x_2,x_3) - q_{23}). \tag{2} \]

Applying the discrepancy condition (1), we see that, after integrating the above expression over all
$x_1 \in X_1$, $x_2 \in X_2$, $x_3 \in X_3$, each term in (2) is at most $\epsilon$ in absolute value. The result follows for
triangles. The general case follows similarly.
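The triangle case of Proposition 1.8 can be checked numerically on a tiny example. The Python sketch below (illustrative only, with hypothetical helper names `density` and `disc`) builds a random weighted tripartite graph with parts of size 4, takes each $q_{ij}$ to be the average weight between the two parts, and computes the smallest $\epsilon$ for which each pair satisfies DISC$(q_{ij}, \epsilon)$. Because the expression in the discrepancy condition is linear in each of $u$ and $v$ separately, its maximum over $[0,1]$-valued functions is attained at indicator functions of subsets, so a brute-force search over subsets suffices.

```python
import random
from itertools import product

def density(G, X, Y):
    """Average weight of G between the vertex lists X and Y."""
    return sum(G[x][y] for x in X for y in Y) / (len(X) * len(Y))

def disc(G, X, Y, q):
    """Smallest eps such that (X, Y) satisfies DISC(q, eps), by brute force over subsets."""
    worst = 0.0
    for mx, my in product(range(1 << len(X)), range(1 << len(Y))):
        s = sum(G[x][y] - q
                for i, x in enumerate(X) if mx >> i & 1
                for j, y in enumerate(Y) if my >> j & 1)
        worst = max(worst, abs(s))
    return worst / (len(X) * len(Y))

random.seed(0)
parts = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
pairs = [(0, 1), (0, 2), (1, 2)]
G = [[0.0] * 12 for _ in range(12)]
for a, b in pairs:
    for x in parts[a]:
        for y in parts[b]:
            G[x][y] = G[y][x] = random.random()  # symmetric random weights in [0, 1)

q = {(a, b): density(G, parts[a], parts[b]) for a, b in pairs}
eps = max(disc(G, parts[a], parts[b], q[(a, b)]) for a, b in pairs)

# G(H) for the triangle: average of G(x1,x2) G(x1,x3) G(x2,x3) over X1 x X2 x X3
g_h = sum(G[x1][x2] * G[x1][x3] * G[x2][x3]
          for x1 in parts[0] for x2 in parts[1] for x3 in parts[2]) / 64
q_h = q[(0, 1)] * q[(0, 2)] * q[(1, 2)]
print(abs(g_h - q_h), 3 * eps)
```

The first printed number is the counting error $|G(H) - q(H)|$ and the second is the bound $e(H)\epsilon$ with $e(H) = 3$.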

In order to prove extremal results in sparse graphs, we would like to transfer some of this 
machinery to the sparse setting. Because the number of copies of a subgraph in a sparse graph G is 
small, the error between the expected count and the actual count must also be small for a counting 
lemma to be meaningful. Another way to put this is that we aim to achieve a small multiplicative 
error in our count. 

Since we require smaller errors when counting in sparse graphs, we need stronger discrepancy 
hypotheses. In the following definition, we should view p as the order of magnitude density of the 
graph, so that the error terms should be bounded in the same order of magnitude. In a dense 
graph, $p = 1$. We assume that $q \le p$. It may be helpful to think of $q/p$ as bounded below by some
positive constant, although this is not strictly required. 
Definition 1.9 (DISC). A weighted bipartite graph $G \colon X \times Y \to [0,1]$ is said to satisfy DISC$(q, p, \epsilon)$
if

\[ \left| \int_{x \in X,\, y \in Y} (G(x,y) - q)u(x)v(y)\,dxdy \right| \le \epsilon p \]

for all functions $u \colon X \to [0,1]$ and $v \colon Y \to [0,1]$.

Unfortunately, discrepancy alone is not strong enough for counting in sparse graphs. Consider
the following example. Let $G$ be a tripartite graph with vertex sets $X_1, X_2, X_3$, such that $(X_1, X_2)_G$
and $(X_2, X_3)_G$ satisfy DISC$(q, p, \frac{\epsilon}{2})$. Let $X_2'$ be a subset of $X_2$ with size $\frac{\epsilon}{2}p|X_2|$. Let $G'$ be modified
from $G$ by adding the complete bipartite graph between $X_1$ and $X_2'$, as well as the complete bipartite
graph between $X_2'$ and $X_3$. The resulting pairs $(X_1, X_2)_{G'}$ and $(X_2, X_3)_{G'}$ satisfy DISC$(q, p, \epsilon)$.
Consider the number of paths in $G$ and $G'$ with one vertex from each of $X_1, X_2, X_3$ in turn. Given
the densities, we expect there to be approximately $q^2|X_1||X_2||X_3|$ such paths, and we would like
the error to be $\delta p^2|X_1||X_2||X_3|$ for some small $\delta$ that goes to zero as $\epsilon$ goes to zero. However, the
number of paths in $G'$ from $X_1$ to $X_2'$ to $X_3$ is $\frac{\epsilon}{2}p|X_1||X_2||X_3|$, which is already too large when $p$
is small.

For our counting lemma to work, G needs to be a relatively dense subgraph of a much more 
pseudorandom host graph $\Gamma$. In the dense case, $\Gamma$ can be the complete graph. In the sparse world,
we require $\Gamma$ to satisfy the jumbledness condition. In practice, we will use the following equivalent
definition. The equivalence follows by considering random subsets of X and Y, where x and y are 
chosen with probabilities u{x) and v{y), respectively. 



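Definition 1.10 (Jumbledness). A bipartite graph $\Gamma = (X \cup Y, E_\Gamma)$ is $(p, \gamma\sqrt{|X||Y|})$-jumbled if

\[ \left| \int_{x \in X,\, y \in Y} (\Gamma(x,y) - p)u(x)v(y)\,dxdy \right| \le \gamma\sqrt{\int_{x \in X} u(x)\,dx}\,\sqrt{\int_{y \in Y} v(y)\,dy} \tag{3} \]

for all functions $u \colon X \to [0,1]$ and $v \colon Y \to [0,1]$.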



With the discrepancy condition defined as in Definition 1.9, we may now state a regularity
lemma for sparse graphs. Such a lemma was originally proved independently by Kohayakawa [54]
and by Rödl (see also [44, 80]). The following result, tailored specifically to jumbled graphs, follows
easily from the main result in 



Theorem 1.11 (Regularity lemma in jumbled graphs). For every $\epsilon > 0$ and every positive integer
$m_0$, there exists $\eta > 0$ and a positive integer $M$ such that if $\Gamma$ is a $(p, \eta pn)$-jumbled graph on $n$
vertices any weighted subgraph $G$ of $\Gamma$ has an equitable partition into $k$ pieces with $m_0 \le k \le M$
such that all but at most $\epsilon k^2$ pairs of vertex subsets $(V_i, V_j)$ satisfy DISC$(q_{ij}, p, \epsilon)$ for some $q_{ij}$.

The main result of this paper is a counting lemma which complements this regularity lemma.
Proving such an embedding lemma has remained an important open problem ever since Kohayakawa
and Rödl first proved the sparse regularity lemma. Most of the work has focused on applying the
sparse regularity lemma in the context of random graphs. The key conjecture in this case, known
as the KLR conjecture, concerns the probability threshold down to which a random graph is, with
high probability, such that any regular subgraph contains a copy of a particular subgraph $H$. This
conjecture has only been resolved very recently [10, 26]. For pseudorandom graphs, it has been a
wide open problem to prove a counting lemma which complements the sparse regularity lemma.
The first progress on proving such a counting lemma was made recently in [59], where the authors proved
a counting lemma for triangles. Here, we prove a counting lemma which works for any graph $H$.
Even for triangles, our counting lemma gives an improvement over the results in [59], since our
results have polynomial-type dependence on the discrepancy parameters, whereas the results in
[59] require exponential dependence since a weak regularity lemma was used as an intermediate step
during their proof of the triangle counting lemma.

Our results are also related to the work of Green and Tao [52] on arithmetic progressions in 
the primes. What they really prove is the stronger result that Szemeredi's theorem on arithmetic 
progressions holds in subsets of the primes. In order to do this, they first show that the primes, 
or rather the almost primes, are a pseudorandom subset of the integers and then that Szemeredi's 
theorem continues to hold relative to such pseudorandom sets. In the language of their paper, our 
counting lemma is a generalized von Neumann theorem. 

Here is the statement of our first counting lemma. Note that, given a graph $H$, the line graph
$L(H)$ is the graph whose vertices are the edges of $H$ and where two vertices are adjacent if their
corresponding edges in $H$ share an endpoint. Recall that $d(\cdot)$ is the degeneracy and $\Delta(\cdot)$ is the
maximum degree.

Theorem 1.12. Let $H$ be a graph with vertex set $\{1, \ldots, m\}$ and with $e(H)$ edges. For every $\theta > 0$,
there exist $c, \epsilon > 0$ of size at least polynomial in $\theta$ so that the following holds.

Let $p > 0$ and let $\Gamma$ be a graph with vertex subsets $X_1, \ldots, X_m$ and suppose that the bipartite
graph $(X_i, X_j)_\Gamma$ is $(p, cp^k \sqrt{|X_i||X_j|})$-jumbled for every $i < j$, where
$k \ge \min\left\{ \frac{\Delta(L(H)) + 4}{2}, \frac{d(L(H)) + 6}{2} \right\}$.

Let $G$ be a subgraph of $\Gamma$, with the vertex $i$ of $H$ assigned to the vertex subset $X_i$ of $G$. For each
edge $ij$ in $H$, assume that $(X_i, X_j)_G$ satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon)$. Define

$$G(H) := \int_{x_1 \in X_1, \ldots, x_m \in X_m} \prod_{(i,j) \in E(H)} G(x_i, x_j) \, dx_1 \cdots dx_m$$

and

$$q(H) := \prod_{(i,j) \in E(H)} q_{ij}.$$

Then

$$|G(H) - q(H)| \le \theta p^{e(H)}.$$

For some graphs $H$, our methods allow us to achieve slightly better values of $k$ in Theorem
1.12. However, the value given in the theorem is the cleanest general statement. See Table 1 for
some examples of hypotheses on $k$ for various graphs $H$. To see that the value of $k$ is never far from
best possible, we first note that $\Delta(H) - 1 \le d(L(H)) \le \Delta(H) + d(H) - 2$.
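
The sandwich $\Delta(H) - 1 \le d(L(H)) \le \Delta(H) + d(H) - 2$ can be sanity-checked on small graphs. The following is a minimal Python sketch, not part of the paper: the helper names (`graph`, `line_graph`, `degeneracy`, `max_degree`, `check_sandwich`) and the sample graphs are illustrative assumptions, and the degeneracy is computed by the standard procedure of repeatedly deleting a minimum-degree vertex.

```python
from itertools import combinations

def graph(edges):
    """Adjacency sets {vertex: set(neighbors)} of a simple graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def line_graph(edges):
    """Line graph L(H): vertices are edges of H, adjacent when they share an endpoint."""
    es = [frozenset(e) for e in edges]
    adj = {e: set() for e in es}
    for e, f in combinations(es, 2):
        if e & f:
            adj[e].add(f)
            adj[f].add(e)
    return adj

def degeneracy(adj):
    """Repeatedly delete a minimum-degree vertex; return the largest degree seen."""
    adj = {v: set(ns) for v, ns in adj.items()}
    best = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        best = max(best, len(adj[v]))
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return best

def max_degree(adj):
    return max(len(ns) for ns in adj.values())

def check_sandwich(edges):
    """True if Delta(H) - 1 <= d(L(H)) <= Delta(H) + d(H) - 2 holds for H."""
    H, LH = graph(edges), line_graph(edges)
    lower = max_degree(H) - 1
    upper = max_degree(H) + degeneracy(H) - 2
    return lower <= degeneracy(LH) <= upper

samples = {
    "triangle": [(1, 2), (2, 3), (1, 3)],
    "C4": [(1, 2), (2, 3), (3, 4), (4, 1)],
    "K4": list(combinations(range(4), 2)),
    "star K_{1,3}": [(0, 1), (0, 2), (0, 3)],
}
for name, edges in samples.items():
    print(name, check_sandwich(edges))
```

For the star both inequalities are tight, while for $K_4$ only the upper one is.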

Let $H$ have maximum degree $\Delta$. By considering the random graph $G_{n,p}$ with $p = n^{-1/\Delta}$, we can
find a $(p, cp^{\Delta/2} n)$-jumbled graph $\Gamma$ containing approximately $p^{e(H)} n^{v(H)}$ labeled copies of $H$. We
modify $\Gamma$ to form $\Gamma'$ by fixing one vertex $v$ and connecting it to everything else. It is easy to check
that the resulting graph $\Gamma'$ is $(p, c'p^{\Delta/2} n)$-jumbled. However, the number of copies of $H$ disagrees
with the expected count, since there are approximately $p^{e(H)} n^{v(H)}$ labeled copies from the original
graph $\Gamma$ and a further approximately $p^{e(H)-\Delta} n^{v(H)-1} = p^{e(H)} n^{v(H)}$ labeled copies containing $v$. We
conclude that for $k < \Delta/2$ we cannot hope to have such a counting lemma and, therefore, the value
of $k$ in Theorem 1.12 is close to optimal.
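
The arithmetic behind the count of copies containing $v$ can be written out explicitly. The display below is a worked check under a heuristic reading that the text does not spell out: the extra copies are assumed to arise by mapping a vertex of degree $\Delta$ of $H$ to $v$, so that its $\Delta$ incident edges are present for free, while the remaining $v(H) - 1$ vertices and $e(H) - \Delta$ edges behave as in the random graph.

```latex
\[
p = n^{-1/\Delta}
\;\Longrightarrow\;
p^{-\Delta} = n
\;\Longrightarrow\;
p^{e(H)-\Delta}\, n^{v(H)-1}
  = p^{e(H)} \cdot p^{-\Delta} \cdot n^{v(H)-1}
  = p^{e(H)}\, n^{v(H)} .
\]
```

So the copies through $v$ are of the same order as all the copies in the original graph, which is why the count is off by a constant factor.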

Since we are dealing with sparse graphs, the discrepancy condition 1.9 appears, at first sight,
to be rather weak. Suppose, for instance, that we have a sparse graph satisfying $\mathrm{DISC}(q, p, \epsilon)$
between each pair of sets from $V_1$, $V_2$ and $V_3$ and we wish to embed a triangle between the three
sets. Then, a typical vertex $v$ in $V_1$ will have neighborhoods of size roughly $q|V_2|$ and $q|V_3|$ in $V_2$
and $V_3$, respectively. But now the condition $\mathrm{DISC}(q, p, \epsilon)$ tells us nothing about the density of
edges between the two neighborhoods. They are simply too small.

Table 1: Sufficient conditions on $k$ in the jumbledness hypothesis $(p, cp^k \sqrt{|X_i||X_j|})$ for the counting
lemmas of various graphs. Two-sided counting refers to results of the form $|G(H) - q(H)| \le \theta p^{e(H)}$,
while one-sided counting refers to results of the form $G(H) \ge q(H) - \theta p^{e(H)}$.

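$H$            Two-sided counting, $k \ge$    One-sided counting, $k \ge$    Range / reference
$K_t$          $t$                            $t$                            $t \ge 3$
$C_\ell$       2                              2                              $\ell = 4$
$K_{2,t}$      …                              …                              $t \ge 3$
$K_{s,t}$      …                              …                              $3 \le s \le t$
Tree           $\frac{\Delta(H)+1}{2}$        No jumbledness needed          See Prop. 4.9 and 7.7
$K_{1,2,2}$    4                              4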



To get around this, Gerke, Kohayakawa, Rodl and Steger [42] showed that if $(X, Y)$ is a pair
satisfying $\mathrm{DISC}(q, p, \epsilon)$ then, with overwhelmingly high probability, a small randomly chosen pair
of subsets $X' \subseteq X$ and $Y' \subseteq Y$ will satisfy $\mathrm{DISC}(q, p, \epsilon')$, where $\epsilon'$ tends to zero with $\epsilon$. We say
that the pair inherits regularity. This may be applied effectively to prove embedding lemmas in
random graphs (see, for example, [43, 60]). For pseudorandom graphs, the beginnings of such an
approach may be found in [M].

Our approach in this paper works in the opposite direction. Rather than using the inheritance 
property to aid us in proving counting lemmas, we first show how one may prove the counting 
lemma and then use it to prove a strong form of inheritance in jumbled graphs. For example, we 
have the following theorem. 

Proposition 1.13. For any $\alpha > 0$, $\xi > 0$ and $\epsilon' > 0$, there exist $c > 0$ and $\epsilon > 0$ of size at least
polynomial in $\alpha, \xi, \epsilon'$ such that the following holds.

Let $p \in (0, 1]$ and $q_{XY}, q_{XZ}, q_{YZ} \in [\alpha p, p]$. Let $\Gamma$ be a tripartite graph with vertex sets $X$, $Y$ and
$Z$ and $G$ be a subgraph of $\Gamma$. Suppose that

• $(X, Y)_\Gamma$ is $(p, cp^4 \sqrt{|X||Y|})$-jumbled and $(X, Y)_G$ satisfies $\mathrm{DISC}(q_{XY}, p, \epsilon)$; and

• $(X, Z)_\Gamma$ is $(p, cp^2 \sqrt{|X||Z|})$-jumbled and $(X, Z)_G$ satisfies $\mathrm{DISC}(q_{XZ}, p, \epsilon)$; and

• $(Y, Z)_\Gamma$ is $(p, cp^3 \sqrt{|Y||Z|})$-jumbled and $(Y, Z)_G$ satisfies $\mathrm{DISC}(q_{YZ}, p, \epsilon)$.

Then at least $(1 - \xi)|Z|$ vertices $z \in Z$ have the property that $|N_X(z)| \ge (1 - \xi) q_{XZ} |X|$,
$|N_Y(z)| \ge (1 - \xi) q_{YZ} |Y|$, and $(N_X(z), N_Y(z))_G$ satisfies $\mathrm{DISC}(q_{XY}, p, \epsilon')$.

The question now arises as to why one would prove that the inheritance property holds if 
we already know its intended consequence. Surprisingly, there is another counting lemma, giving 
only a lower bound on G{H), which is sufficient to establish the various extremal results but 
typically requires a much weaker jumbledness assumption. The proof of this statement relies on the 
inheritance property in a critical way. The notations $G(H)$ and $q(H)$ were defined in Theorem 1.12.

Theorem 1.14. For every fixed graph $H$ on vertex set $\{1, 2, \ldots, m\}$ and every $\alpha, \theta > 0$, there exist
constants $c > 0$ and $\epsilon > 0$ such that the following holds.

Let $\Gamma$ be a graph with vertex subsets $X_1, \ldots, X_m$ and suppose that the bipartite graph $(X_i, X_j)_\Gamma$
is $(p, cp^{d_2(H)+3} \sqrt{|X_i||X_j|})$-jumbled for every $i < j$ with $ij \in E(H)$. Let $G$ be a subgraph of $\Gamma$,
with the vertex $i$ of $H$ assigned to the vertex subset $X_i$ of $G$. For each edge $ij$ of $H$, assume that
$(X_i, X_j)_G$ satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon)$, where $\alpha p \le q_{ij} \le p$. Then

$$G(H) \ge (1 - \theta) q(H).$$

We refer to Theorem 1.14 as a one-sided counting lemma, as we get a lower bound for $G(H)$
but no upper bound. However, in order to prove the theorems of Section 1.2, we only need a lower
bound. The proof of Theorem 1.14 is a sparse version of a classical embedding strategy in regular
graphs (see, for example, [20l dOl [50]). Note that, as in the theorems of Section 1.2, the exponent
$d_2(H) + 3$ can be improved for certain graphs $H$. We will say more about this later. Moreover,
one cannot hope to do better than $\beta = O(p^{(d(H)+2)/4} n)$, so that the condition on $\beta$ is sharp up to
a multiplicative constant in the exponent of $p$. We suspect that the exponent may even be sharp
up to an additive constant.

Organization. We will begin in the next section by giving a high level overview of the proof
of our counting lemmas. In Section 3, we prove some useful statements about counting in the
pseudorandom graph $\Gamma$. Then, in Section 4, we prove the sparse counting lemma, Theorem 1.12.
The short proof of Proposition 1.13 and some related propositions about inheritance are given
in Section 5. The proof of the one-sided counting lemma, which uses inheritance, is then given
in Section 6. In Section 7, we take a closer look at one-sided counting in cycles. The sparse
counting lemma has a large number of applications extending many classical results to the sparse
setting. In Section 8, we discuss a number of them in detail, including sparse extensions of the
Erdos-Stone-Simonovits theorem, Ramsey's theorem, the graph removal lemma and the removal
lemma for groups. In Section 9, we briefly discuss a number of other applications, such as relative
quasirandomness, induced Ramsey numbers, algorithmic applications and multiplicity results.

2 Counting strategy 

In this section, we give a general overview of our approach to counting. There are two types
of counting results: two-sided counting and one-sided counting. Two-sided counting refers to
results of the form $|G(H) - q(H)| \le \theta p^{e(H)}$ while one-sided counting refers to results of the form
$G(H) \ge q(H) - \theta p^{e(H)}$. One-sided counting is always implied by two-sided counting, although
sometimes we are able to obtain one-sided counting results under weaker hypotheses.

2.1 Two-sided counting 

There are two main ingredients to the proof: doubling and densification. These two procedures
reduce the problem of counting embeddings of $H$ to the same problem for some other graphs $H'$.

If $a \in V(H)$, the graph $H$ with $a$ doubled, denoted $H_{a \times 2}$, is the graph created from $H$ by
adding a new vertex $a'$ whose neighbors are precisely the neighbors of $a$. In the assignment of
vertices of $H_{a \times 2}$ to vertex subsets of $\Gamma$, the new vertex $a'$ is assigned to the same vertex subset of
$\Gamma$ as $a$. For example, the following figure shows a triangle with a vertex doubled.
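
The doubling operation is easy to state on edge lists. The following is a minimal Python sketch, not taken from the paper: the function name `double_vertex` and the label `"1'"` for the new vertex are hypothetical choices made only for illustration.

```python
def double_vertex(edges, a, a_copy):
    """Return the edge list of H with vertex `a` doubled: a new vertex
    `a_copy` is added whose neighbours are exactly the neighbours of `a`."""
    new_edges = list(edges)
    for u, v in edges:
        if u == a:
            new_edges.append((a_copy, v))
        elif v == a:
            new_edges.append((u, a_copy))
    return new_edges

# A triangle on {1, 2, 3} with vertex 1 doubled.
triangle = [(1, 2), (1, 3), (2, 3)]
doubled = double_vertex(triangle, 1, "1'")

# The edges touching 1 or its copy form the 4-cycle 1 - 2 - 1' - 3 - 1;
# the remaining edge (2, 3) is the one that does not touch the doubled vertex.
cycle_part = [e for e in doubled if 1 in e or "1'" in e]
print(doubled)
print(cycle_part)
```

In this example the edges at the doubled vertex alone already form a cycle of length 4, which is the shape that the triangle reduction leads to.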

A typical reduction using doubling is summarized in Figure 1. Each graph represents the claim
that the number of embeddings of the graph drawn, where the straight edges must land in $G$ and
the wavy edges must land in $\Gamma$, is approximately what one would expect from multiplying together
the appropriate edge densities between the vertex subsets of $G$ and $\Gamma$.

The top arrow in Figure 1 is the doubling step. This allows us to reduce the problem of counting
$H$ to that of counting a number of other graphs, each of which may have some edges which embed
into $G$ and some which embed into $\Gamma$. For example, if we let $H_{-a}$ be the graph that we get by
omitting every edge which is connected to a particular vertex $a$, we are interested in the number
of copies of $H_{-a}$ in both $G$ and $\Gamma$. We are also interested in the original graph but now on the
understanding that the edges incident with $a$ embed into $G$ while those that do not touch $a$ embed
into $\Gamma$. Finally, we are interested in the graph $H_{a \times 2}$ formed by doubling the vertex $a$, but again
the edges which do not touch $a$ or its copy $a'$ only have to embed into $\Gamma$. This reduction, which is
justified by an application of the Cauchy-Schwarz inequality, will be detailed in Section 4.1.

The bottom two arrows in Figure 1 are representative of another reduction, where we can reduce
the problem of counting a particular graph, with edges that map to both $G$ and $\Gamma$, into one where
we only care about the edges that embed into $G$. We can make such a reduction because counting
embeddings into $\Gamma$ is much easier due to its jumbledness. We will discuss this reduction, amongst
other properties of jumbled graphs $\Gamma$, in Section 3.




Figure 1: The doubling reduction. Each graph represents some counting lemma. The straight edges
must embed into $G$ while wavy edges must embed into the jumbled graph $\Gamma$.

For triangles, a similar reduction is shown in Figure 2. In the end, we have changed the task of
counting triangles to the task of counting the number of cycles of length 4. It would be natural now
to apply doubling to the 4-cycle but, unfortunately, this process is circular. Instead, we introduce
an alternative reduction process which we refer to as densification.

Figure 2: The doubling reduction for counting triangles.



In the above reduction from triangles to 4-cycles, two of the vertices of the 4-cycle are embedded
into the same part $X_i$ of $G$. We actually consider the more general setting where the vertices of
the 4-cycle lie in different parts, $X_1, X_2, X_3, X_4$, of $G$.

Assume without loss of generality that there is no edge between $X_1$ and $X_3$ in $G$. Let us
add a weighted graph between $X_1$ and $X_3$, where the weight on the edge $x_1 x_3$ is proportional
to the number of paths $x_1 x_4 x_3$ for $x_4 \in X_4$. Since $(X_1, X_4)_G$ and $(X_3, X_4)_G$ satisfy discrepancy,
the number of paths will be on the order of $q_{14} q_{34} |X_4|$ for most pairs $(x_1, x_3)$. After discarding
a negligible set of pairs $(x_1, x_3)$ that give too many paths, and then appropriately rescaling the
weights of the other edges $x_1 x_3$, we create a weighted bipartite graph between $X_1$ and $X_3$ that
behaves like a dense weighted graph satisfying discrepancy. Furthermore, counting 4-cycles in
$X_1, X_2, X_3, X_4$ is equivalent to counting triangles in $X_1, X_2, X_3$ due to the choice of weights. We
call this process densification. It is illustrated below. In the figure, a thick edge signifies that the
bipartite graph that it embeds into is dense.
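
The densification step can be sketched numerically. The following minimal Python sketch is an illustration only: it assumes the capped and rescaled codegree weight that appears in the densification lemma of Section 4.2 (cap at twice the product of the two densities, then divide by that cap), and the function name `densify`, the argument layout and the tiny example matrices are hypothetical.

```python
def densify(G_left, G_right, p_left, p_right):
    """G_left[x1][xm] and G_right[xm][x3] hold edge weights in [0, 1], where
    xm ranges over the middle part. Returns weights in [0, 1] on pairs (x1, x3)."""
    n_mid = len(G_right)
    n_out = len(G_right[0])
    cap = 2 * p_left * p_right
    dense = []
    for row in G_left:
        dense_row = []
        for x3 in range(n_out):
            # Codegree density: average over the middle part of the path weights.
            codegree = sum(row[xm] * G_right[xm][x3] for xm in range(n_mid)) / n_mid
            # Cap the pairs with too many paths, then rescale to the range [0, 1].
            dense_row.append(min(codegree, cap) / cap)
        dense.append(dense_row)
    return dense

# Toy example with parts of size 2 and both densities taken to be 0.5.
G14 = [[1, 0], [0, 0]]   # weights between X_1 and the middle part X_4
G43 = [[1, 0], [1, 1]]   # weights between X_4 and X_3
weights = densify(G14, G43, 0.5, 0.5)
print(weights)
```

The point of the rescaling is that a typical pair, whose codegree density is about the product of the two sparse densities, receives weight of constant order, so the new bipartite graph behaves like a dense one.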




More generally, if $b$ is a vertex of $H$ of degree 2, with neighbors $\{a, c\}$, such that $a$ and $c$ are not
adjacent, then densification allows us to transform $H$ by removing the edges $ab$ and $bc$ and adding
a dense edge $ac$, as illustrated below. For more on this process, we refer the reader to Section 4.2.







We needed to count 4-cycles in order to count triangles, so it seems at first as if our reduction
from 4-cycles to triangles is circular. However, instead of counting triangles in a sparse graph, we
now have a dense bipartite graph between one of the pairs of vertex subsets. Since it is easier to
count in dense graphs than in sparse graphs, we have made progress. The next step is to do doubling
again. This is shown in Figure 3. The bottommost arrow is another application of densification.

Figure 3: The doubling reduction for triangles with one dense edge.

We have therefore reduced the problem of counting triangles in a sparse graph to that of counting
triangles in a dense weighted graph, which we already know how to do. This completes the counting
lemma for 4-cycles.

In Figure 1, doubling reduces counting in a general $H$ to counting $H$ with one vertex deleted
(which we handle by induction) as well as graphs of the form $K_{1,t}$ and $K_{2,t}$. Trees like $K_{1,t}$ are not
too hard to count. It therefore remains to count $K_{2,t}$. As with counting $C_4$ (the case $t = 2$), we
first perform a densification.



The graph on the right can be counted using doubling and induction, as shown in Figure 4. Note
that the $C_4$ count is required as an input to this step. This then completes the proof of the counting
lemma.

2.2 One-sided counting 

For one-sided counting, we embed the vertices of H into those of G one at a time. By making a 
choice for where a vertex a of H lands in G, we shrink the set of possible targets for each neighbor 
of a. These target sets shrink by a factor roughly corresponding to the edge densities of G, as most 
vertices of G have close to the expected number of neighbors due to discrepancy. This allows us to 
obtain a lower bound on the number of embeddings of H into G. 

The above argument is missing one important ingredient. When we shrink the set of possible
targets of vertices in $H$, we do not know if $G$ restricted to these smaller vertex subsets still satisfies
the discrepancy condition, which is needed for embedding later vertices. When $G$ is dense, this is
not an issue, since the restricted vertex subsets have size at least a constant factor of the original
vertex subsets, and thus discrepancy is inherited. When $G$ is sparse, the restricted vertex subsets
can become much smaller than the original vertex subsets, so discrepancy is not automatically
inherited.

Figure 4: The doubling reduction for counting $K_{2,t}$.

To address this issue, we observe that discrepancy between two vertex sets follows from some
variant of the $K_{2,2}$ count (and the counting lemma shows that they are in fact equivalent). By our
counting lemma, we also know that the graph below has roughly the expected count. This in turn
implies that discrepancy is inherited in the neighborhoods of $G$ since, roughly speaking, it implies
that almost every vertex has roughly the expected number of 4-cycles in its neighborhood. The
one-sided counting approach sketched above then carries through. For further details on inheritance of
discrepancy, see Section 5. The proof of the one-sided counting lemma may be found in Section 6.




We also prove a one-sided counting lemma for large cycles using much weaker jumbledness 
hypotheses. The idea is to extend densification to more than two edges at a time. We will show 
how to transform a multiply subdivided edge into a single dense edge, as illustrated below. 




Starting with a long cycle, we can perform two such densifications, as shown below. The resulting
triangle is easy to count, since a typical embedding of the top vertex gives a linear-sized neighborhood.
The full details may be found in Section 7.







3 Counting in $\Gamma$

In this section, we develop some tools for counting in $\Gamma$. Here is the setup for this section.

Setup 3.1. Let $\Gamma$ be a graph with vertex subsets $X_1, \ldots, X_m$. Let $p, c \in (0, 1]$ and $k \ge 1$. Let $H$
be a graph with vertex set $\{1, \ldots, m\}$, with vertex $a$ assigned to $X_a$. For every edge $ab$ in $H$, one
of the following two holds:

• $(X_a, X_b)_\Gamma$ is $(p, cp^k \sqrt{|X_a||X_b|})$-jumbled, in which case we set $p_{ab} = p$ and say that $ab$ is a
sparse edge, or

• $(X_a, X_b)_\Gamma$ is a complete bipartite graph, in which case we set $p_{ab} = 1$ and say that $ab$ is a
dense edge.

Let $H^{sp}$ denote the subgraph of $H$ consisting of sparse edges.

3.1 Example: counting triangles in $\Gamma$

We start by showing, as an example, how to prove the counting lemma in $\Gamma$ for triangles. Most of
the ideas found in the rest of this section can already be found in this special case.

Proposition 3.2. Assume Setup 3.1. Let $H$ be a triangle with vertices $\{1, 2, 3\}$. Assume that
$k \ge 2$. Then $|\Gamma(H) - p^3| \le 5cp^3$.

Proof. In the following integrals, we assume that $x$, $y$ and $z$ vary uniformly over $X_1$, $X_2$ and $X_3$,
respectively. We have the telescoping sum

$$\begin{aligned}
\Gamma(H) - p^3 &= \int_{x,y,z} (\Gamma(x,y) - p)\,\Gamma(x,z)\,\Gamma(y,z) \, dx\,dy\,dz \\
&\quad + \int_{x,y,z} p\,(\Gamma(x,z) - p)\,\Gamma(y,z) \, dx\,dy\,dz + p^2 \int_{x,y,z} (\Gamma(y,z) - p) \, dx\,dy\,dz. \qquad (4)
\end{aligned}$$

The third integral on the right-hand side of (4) is bounded in absolute value by $cp^3$ by the jumbledness
of $\Gamma$. In particular, this implies that $\int_{y,z} \Gamma(y,z) \, dy\,dz \le (1 + c)p$. Similarly we have
$\int_{x,z} \Gamma(x,z) \, dx\,dz \le (1 + c)p$. Using (3), the second integral is bounded in absolute value by

$$\int_y cp^3 \sqrt{\int_z \Gamma(y,z) \, dz} \; dy \le cp^3 \sqrt{\int_{y,z} \Gamma(y,z) \, dy\,dz} \le c\sqrt{(1+c)p}\; p^3.$$

Finally, the first integral on the right-hand side of (4) is bounded in absolute value by, using (3)
and the Cauchy-Schwarz inequality,

$$\int_z cp^2 \sqrt{\int_x \Gamma(x,z) \, dx \int_y \Gamma(y,z) \, dy} \; dz \le cp^2 \sqrt{\int_{x,z} \Gamma(x,z) \, dx\,dz} \sqrt{\int_{y,z} \Gamma(y,z) \, dy\,dz} \le c(1+c)p^3. \qquad (5)$$

Therefore, (4) is bounded in absolute value by $5cp^3$. □
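
The telescoping decomposition (4) is a purely algebraic identity, valid for any weights and any value of $p$, so it can be checked numerically. The following minimal Python sketch does this on small random 0/1 weights; the part size, the value of `p`, the seed and all variable names are arbitrary illustrative choices rather than anything taken from the paper.

```python
import random
from itertools import product

random.seed(0)
n, p = 4, 0.3  # part size and density parameter (arbitrary for this check)

def random_bipartite():
    """Random 0/1 weights between two parts of size n."""
    return [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]

# Weights between (X1, X2), (X1, X3) and (X2, X3), respectively.
Gamma12, Gamma13, Gamma23 = random_bipartite(), random_bipartite(), random_bipartite()

def average(term):
    """Average of term(x, y, z) over uniform x in X1, y in X2, z in X3."""
    return sum(term(x, y, z) for x, y, z in product(range(n), repeat=3)) / n**3

lhs = average(lambda x, y, z: Gamma12[x][y] * Gamma13[x][z] * Gamma23[y][z]) - p**3
first = average(lambda x, y, z: (Gamma12[x][y] - p) * Gamma13[x][z] * Gamma23[y][z])
second = average(lambda x, y, z: p * (Gamma13[x][z] - p) * Gamma23[y][z])
third = average(lambda x, y, z: p**2 * (Gamma23[y][z] - p))

gap = abs(lhs - (first + second + third))
print(gap)  # zero up to floating-point rounding
```

Each of the three terms replaces one more factor by $p$, which is why jumbledness can be applied to one pair of parts at a time.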



Remark. (1) In the more general proof, the step corresponding to (5) will be slightly different but
is similar in its application of the Cauchy-Schwarz inequality.

(2) The proof shows that we do not need the full strength of the jumbledness everywhere: we
only need $(p, cp^{3/2} \sqrt{|X||Z|})$-jumbledness for $(X, Z)_\Gamma$ and $(p, cp \sqrt{|Y||Z|})$-jumbledness for $(Y, Z)_\Gamma$.
In Section 6, it will be useful to have a counting lemma with such non-balanced jumbledness
assumptions in order to optimize our result. To keep things simple and clear, we will assume
balanced jumbledness conditions here and remark later on the changes needed when we wish to
have optimal non-balanced ones.

3.2 Notation 

In the proof of the counting lemmas we often meet expressions such as $G(x_1, x_2) G(x_1, x_3) G(x_2, x_3)$
and their integrals. We introduce some compact notation for such products and integrals. Note
that if we are counting copies of $H$, we will usually assign each vertex $a$ of $H$ to some vertex subset
$X_a$ and we will only be interested in counting those embeddings where each vertex of $H$ is mapped
into the vertex subset assigned to it. If $U \subseteq V(H)$, a map $U \to V(G)$ or $U \to V(\Gamma)$ is called
compatible if each vertex of $U$ gets mapped into the vertex set assigned to it. We can usually
assume without loss of generality that the vertex subsets $X_a$ are disjoint for different vertices of $H$,
as we can always create a new multipartite graph with disjoint vertex subsets $X_a$ with the same
$H$-embedding counts as the original graph.

If $f$ is a symmetric function on pairs of vertices of $G$ (actually we only care about its values on
$X_a \times X_b$ for $ab \in E(H)$) and $\mathbf{x} \colon V(H) \to V(G)$ is any compatible map (we write $\mathbf{x}(a) = x_a$), then
we define

$$f(H \mid \mathbf{x}) := \prod_{ab \in E(H)} f(x_a, x_b).$$

By taking the expectation as $\mathbf{x}$ varies uniformly over all compatible maps $V(H) \to V(G)$, we can
define the value of a function on a graph:

$$f(H) := \mathbb{E}_{\mathbf{x}}[f(H \mid \mathbf{x})] = \int_{\mathbf{x}} f(H \mid \mathbf{x}) \, d\mathbf{x}.$$

We shall always assume that the measure $d\mathbf{x}$ is the uniform probability measure on compatible
maps.

For unweighted graphs, we use $G$ and $\Gamma$ to denote also the characteristic function of the edge set
of the graph, so that $G(H)$ is the probability that a uniformly random compatible map
$V(H) \to V(G)$ is a graph homomorphism from $H$ to $G$. For weighted graphs, the values on the edges are the
edge weights. For counting lemmas, we are interested in comparing $G(H)$ with $q(H)$, which comes
from setting $q(x_a, x_b)$ to be some constant $q_{ab}$ for each $ab \in E(H)$.
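
For very small examples, $G(H)$ can be computed directly from this definition by averaging over all compatible maps. The following minimal Python sketch does so by brute force; the function name `hom_density`, the dictionary layout of the parts and the toy graph are hypothetical, and the approach is only feasible for tiny inputs since it enumerates every compatible map.

```python
from itertools import product

def hom_density(weight, parts, edges):
    """f(H): average over compatible maps x (with x_a in parts[a]) of the
    product of weight(x_a, x_b) over the edges ab of H."""
    vertices = sorted(parts)
    total, count = 0.0, 0
    for choice in product(*(parts[v] for v in vertices)):
        x = dict(zip(vertices, choice))
        term = 1.0
        for a, b in edges:
            term *= weight(x[a], x[b])
        total += term
        count += 1
    return total / count

# Toy unweighted graph G on three parts of size 2.
parts = {1: [0, 1], 2: [2, 3], 3: [4, 5]}
G_edges = {frozenset(e) for e in [(0, 2), (0, 4), (2, 4), (1, 3)]}

def G(u, v):
    """Characteristic function of the edge set of G."""
    return 1.0 if frozenset((u, v)) in G_edges else 0.0

triangle = [(1, 2), (1, 3), (2, 3)]
G_H = hom_density(G, parts, triangle)

# q(H): product of the pairwise edge densities, used here as the q_ab.
q_H = 1.0
for edge in triangle:
    q_H *= hom_density(G, parts, [edge])

print(G_H, q_H)
```

In this toy graph the two values differ, as one would expect for a graph that is far from pseudorandom; a counting lemma asserts that they are close once the hypotheses on $\Gamma$ and $G$ hold.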

It will be useful to have some notation for the conditional sum of a function $f$ given that some
vertices have been fixed. If $U \subseteq V(H)$ and $\mathbf{y} \colon U \to V(G)$ is any compatible map, then

$$f(H \mid \mathbf{y}) := \mathbb{E}_{\mathbf{x}}\left[ f(H \mid \mathbf{x}) \,\middle|\, \mathbf{x}|_U = \mathbf{y} \right] = \int_{\mathbf{z}} f(H \mid \mathbf{y}, \mathbf{z}) \, d\mathbf{z},$$

where, in the integral, $\mathbf{z}$ varies uniformly over all compatible maps $V(H) \setminus U \to V(G)$, and the
notation $\mathbf{y}, \mathbf{z}$ denotes the compatible map $V(H) \to V(G)$ built from combining $\mathbf{y}$ and $\mathbf{z}$. Note
that when $U = \emptyset$, $f(H \mid \mathbf{y}) = f(H)$. When $U = V(H)$, the two definitions of $f(H \mid \mathbf{y})$ agree.
When $U = \{a_1, \ldots, a_t\}$, we sometimes write $\mathbf{y}$ as $a_1 \to y_1, \ldots, a_t \to y_t$, so we can write
$f(H \mid a_1 \to y_1, \ldots, a_t \to y_t)$.

Since we work with approximations frequently, it will be convenient if we introduce some shorthand.
If $A$, $B$, $P$ are three quantities, we write

$$A \approx^{P}_{c,\epsilon} B$$

to mean that for every $\theta > 0$, we can find $c, \epsilon > 0$ of size at least polynomial in $\theta$ (i.e., $c, \epsilon \ge \Omega(\theta^r)$
as $\theta \to 0$ for some $r > 0$) so that $|A - B| \le \theta P$. Sometimes one of $c$ or $\epsilon$ is omitted from the $\approx$
notation if $\theta$ does not depend on the parameter. Note that the dependencies do not depend on
the parameters $p$ and $q$, but may depend on the graphs to be embedded, e.g., $H$. For instance, a
counting lemma can be phrased in the form

$$G(H) \approx^{p(H)}_{c,\epsilon} q(H).$$



3.3 Counting graphs in $\Gamma$

We begin by giving a counting lemma in $\Gamma$, which is significantly easier than counting in $G$. We
remark that a similar counting lemma for $\Gamma$ an $(n, d, \lambda)$-regular graph was proven by Alon (see [64,
Thm. 4.10]).



Proposition 3.3. Assume Setup 3.1. If $k \ge \frac{d(L(H^{sp})) + 2}{2}$, then

$$|\Gamma(H) - p(H)| \le \left( (1 + c)^{e(H^{sp})} - 1 \right) p(H).$$

The exact coefficient of $p(H)$ in the bound is not important. Any bound of the form $O(c)\,p(H)$
suffices.

Dense edges play no role, so it suffices to consider the case when all edges of $H$ are sparse. We
prove Proposition 3.3 by iteratively applying the following inequality.

Lemma 3.4. Let $H$ be a graph with vertex set $\{1, \ldots, m\}$. Let $\Gamma$ be a graph with vertex subsets
$X_1, \ldots, X_m$. Let $ab \in E(H)$. Let $H_{-ab}$ denote $H$ with the edge $ab$ removed. Let $H_{-a,-b}$ denote $H$
with all edges incident to $a$ or $b$ removed. Assume that $(X_a, X_b)_\Gamma$ is $(p, \gamma \sqrt{|X_a||X_b|})$-jumbled. Let
$f \colon V(\Gamma) \times V(\Gamma) \to [0, 1]$ be any symmetric function. Then

$$\left| \int_{x \in X_a,\, y \in X_b} (\Gamma(x, y) - p)\, f(H_{-ab} \mid a \to x, b \to y) \, dx\,dy \right| \le \gamma \sqrt{f(H_{-ab})\, f(H_{-a,-b})}.$$



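Proof. Let $H_{a,-ab}$ denote the edges of $H_{-ab}$ incident to $a$, and let $H_{b,-ab}$ be the edges of $H_{-ab}$
incident to $b$. Then $H_{-ab} = H_{-a,-b} \uplus H_{a,-ab} \uplus H_{b,-ab}$, as a disjoint union of edges. In the following
calculation, $\mathbf{z}$ varies uniformly over compatible maps $V(H) \setminus \{a, b\} \to V(\Gamma)$, $x$ varies uniformly
over $X_a$, and $y$ varies uniformly over $X_b$. The three inequalities that appear in the calculation
follow from, in order, the triangle inequality, the jumbledness condition, and the Cauchy-Schwarz
inequality.

$$\begin{aligned}
&\left| \int_{x,y} (\Gamma(x, y) - p)\, f(H_{-ab} \mid a \to x, b \to y) \, dx\,dy \right| \\
&\quad = \left| \int_{\mathbf{z}} f(H_{-a,-b} \mid \mathbf{z}) \int_{x,y} (\Gamma(x, y) - p)\, f(H_{a,-ab} \mid a \to x, \mathbf{z})\, f(H_{b,-ab} \mid b \to y, \mathbf{z}) \, dx\,dy\,d\mathbf{z} \right| \\
&\quad \le \int_{\mathbf{z}} f(H_{-a,-b} \mid \mathbf{z}) \left| \int_{x,y} (\Gamma(x, y) - p)\, f(H_{a,-ab} \mid a \to x, \mathbf{z})\, f(H_{b,-ab} \mid b \to y, \mathbf{z}) \, dx\,dy \right| d\mathbf{z} \\
&\quad \le \int_{\mathbf{z}} f(H_{-a,-b} \mid \mathbf{z})\, \gamma \sqrt{\int_x f(H_{a,-ab} \mid a \to x, \mathbf{z}) \, dx} \sqrt{\int_y f(H_{b,-ab} \mid b \to y, \mathbf{z}) \, dy} \; d\mathbf{z} \\
&\quad = \gamma \int_{\mathbf{z}} f(H_{-a,-b} \mid \mathbf{z}) \sqrt{f(H_{a,-ab} \mid \mathbf{z})} \sqrt{f(H_{b,-ab} \mid \mathbf{z})} \, d\mathbf{z} \\
&\quad \le \gamma \sqrt{\int_{\mathbf{z}} f(H_{-a,-b} \mid \mathbf{z}) \, d\mathbf{z}} \sqrt{\int_{\mathbf{z}} f(H_{-a,-b} \mid \mathbf{z})\, f(H_{a,-ab} \mid \mathbf{z})\, f(H_{b,-ab} \mid \mathbf{z}) \, d\mathbf{z}} \\
&\quad = \gamma \sqrt{f(H_{-a,-b})\, f(H_{-ab})}.
\end{aligned}$$

□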



Proof of Proposition 3.3. As remarked after the statement of the proposition, it suffices to prove the
result in the case when all edges of $H$ are sparse. We induct on the number of edges of $H$. If $H$ has no
edges, then $\Gamma(H) = p(H) = 1$. So assume that $H$ has at least one edge. Since $k \ge \frac{1}{2}(d(L(H)) + 2)$,
we can find an edge $ab$ of $H$ such that $\deg_H(a) + \deg_H(b) \le d(L(H)) + 2 \le 2k$. Let $H_{-ab}$ and
$H_{-a,-b}$ be as in Lemma 3.4. Since $L(H)$ is $(2k - 2)$-degenerate, the line graph of any subgraph
of $H$ is also $(2k - 2)$-degenerate. By the induction hypothesis, we have
$|\Gamma(H_{-ab}) - p(H_{-ab})| \le ((1 + c)^{e(H)-1} - 1)\, p(H_{-ab})$ and
$|\Gamma(H_{-a,-b}) - p(H_{-a,-b})| \le ((1 + c)^{e(H)-1} - 1)\, p(H_{-a,-b})$. We have

$$\Gamma(H) - p(H) = p \cdot (\Gamma(H_{-ab}) - p(H_{-ab})) + \int_{x \in X_a,\, y \in X_b} (\Gamma(x, y) - p)\, \Gamma(H_{-ab} \mid a \to x, b \to y) \, dx\,dy.$$

The first term on the right is bounded in absolute value by $((1 + c)^{e(H)-1} - 1)\, p(H)$. For the second
term, by Lemma 3.4 and the induction hypothesis, we have

$$\begin{aligned}
\left| \int_{x,y} (\Gamma(x, y) - p)\, \Gamma(H_{-ab} \mid a \to x, b \to y) \, dx\,dy \right|
&\le cp^k \sqrt{\Gamma(H_{-ab})\, \Gamma(H_{-a,-b})} \\
&\le cp^k (1 + c)^{e(H)-1} \sqrt{p(H_{-ab})\, p(H_{-a,-b})} \\
&\le c(1 + c)^{e(H)-1} p(H). \qquad (6)
\end{aligned}$$

The last inequality is where we used $2k \ge \deg_H(a) + \deg_H(b)$. Combining the two estimates gives
the desired result. □
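
The two closing steps of this proof can be written out in full. The display below is a worked check, assuming as in the proof that all edges of $H$ are sparse, so that $p(H') = p^{e(H')}$ for every subgraph $H'$, and using that $H_{-a,-b}$ has $e(H) - (\deg_H(a) + \deg_H(b) - 1)$ edges.

```latex
\begin{align*}
p^{k} \sqrt{p(H_{-ab})\, p(H_{-a,-b})}
  &= p^{k} \sqrt{p^{\,e(H)-1} \cdot p^{\,e(H)-\deg_H(a)-\deg_H(b)+1}}
   = p^{\,k + e(H) - \frac{1}{2}(\deg_H(a)+\deg_H(b))}
   \le p^{\,e(H)} = p(H), \\
\bigl( (1+c)^{e(H)-1} - 1 \bigr) + c\,(1+c)^{e(H)-1}
  &= (1+c)^{e(H)} - 1 .
\end{align*}
```

The first line uses $p \le 1$ together with $2k \ge \deg_H(a) + \deg_H(b)$, and the second line is exactly the coefficient claimed in the proposition.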



3.4 Counting partial embeddings into $\Gamma$

As outlined in Section 2, we need to count embeddings of $H$ where some edges are embedded into
$G$ (the straight edges in the figures) and some edges are embedded into $\Gamma$ (the wavy edges). We
prove counting estimates for these embeddings here. The main result of this section is summarized
in the figure below. The proofs are almost identical to that of Proposition 3.3. We just need to be
a little more careful with the exponents on the jumbledness parameter.

First we consider the case where exactly one edge needs to be embedded into $\Gamma$ and the other
edges are embedded into some subgraph of $\Gamma$. To state the result requires a little notation. Suppose
that $H = H' \cup H''$ is an edge disjoint partition of the graph $H$ into two subgraphs $H'$ and $H''$.
We define $d(L(H', H''))$ to be the smallest $d$ such that there is an ordering of the edges of $H$ with the
edges of $H'$ occurring before the edges of $H''$ such that every edge $e$ has at most $d$ neighbors, that
is, edges containing either of the endpoints of $e$, which appear earlier in the ordering.

Lemma 3.5. Assume Setup 3.1. Let $ab \in E(H)$ and $H_{-ab}$ be the graph $H$ with edge $ab$ removed.
Assume $k \ge \frac{d(L(H^{sp}_{-ab},\, ab)) + 2}{2}$. Let $G$ be any weighted subgraph of $\Gamma$ (i.e., $0 \le G \le \Gamma$ as functions).
Let $g$ denote the function that agrees with $\Gamma$ on $X_a \times X_b$ and with $G$ everywhere else. Then

$$|g(H) - p_{ab}\, G(H_{-ab})| \le c(1 + c)^{e(H^{sp}) - 1} p(H).$$

The lemma follows from essentially the same calculation as (6), except that we take $ab$ as our
first edge to remove (this is why there is a stronger requirement on $k$) and then use $G \le \Gamma$.

Iterating the lemma, we obtain the following result where multiple edges need to be embedded
into $\Gamma$. It can be proved by iterating Lemma 3.5 or mimicking the proof of Proposition 3.3.

Lemma 3.6. Assume Setup 3.1. Let $H'$ be a subgraph of $H$. Assume $k \ge \frac{d(L(H'^{sp},\, (H \setminus H')^{sp})) + 2}{2}$.
Let $G$ be a weighted subgraph of $\Gamma$. Let $g$ be a function that agrees with $\Gamma$ on $X_a \times X_b$ when
$ab \in E(H \setminus H')$ and with $G$ otherwise. Then

$$|g(H) - p(H \setminus H')\, G(H')| \le \left( (1 + c)^{e(H^{sp})} - 1 \right) p(H).$$

3.5 Exceptional sets

This section contains a couple of lemmas about $\Gamma$ that we will need later on. The reader may
choose to skip this section until the results are needed.

We begin with a standard estimate for the number of vertices in a jumbled graph whose degrees 
deviate from the expected value. The proof follows immediately from the definition of jumbledness. 



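Lemma 3.7. Let $\Gamma$ be a $(p, \gamma \sqrt{|X||Y|})$-jumbled graph between vertex subsets $X$ and $Y$. Let
$v \colon Y \to [0, 1]$ and let $\xi > 0$. If

$$U \subseteq \left\{ x \in X : \int_{y \in Y} \Gamma(x, y)\, v(y) \, dy \ge (1 + \xi)\, p\, \mathbb{E} v \right\}$$

or

$$U \subseteq \left\{ x \in X : \int_{y \in Y} \Gamma(x, y)\, v(y) \, dy \le (1 - \xi)\, p\, \mathbb{E} v \right\},$$

then

$$\frac{|U|}{|X|} \le \frac{\gamma^2}{\xi^2 p^2\, \mathbb{E} v}.$$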
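
The one-line argument behind this lemma can be spelled out as follows. The display is a sketch under one assumption that is not restated in this section: jumbledness is used in its functional form, namely $\left| \int_{x,y} (\Gamma(x,y) - p)\, u(x)\, v(y) \, dx\,dy \right| \le \gamma \sqrt{\mathbb{E} u \; \mathbb{E} v}$ for all $u \colon X \to [0,1]$ and $v \colon Y \to [0,1]$. Take $u = 1_U$ and write $\mu = |U|/|X| = \mathbb{E} u$.

```latex
\begin{align*}
\xi\, p\, \mu\, \mathbb{E} v
  \;\le\; \left| \int_{x \in X} \int_{y \in Y} (\Gamma(x,y) - p)\, 1_U(x)\, v(y) \, dy\, dx \right|
  \;\le\; \gamma \sqrt{\mu\, \mathbb{E} v}
\quad\Longrightarrow\quad
\mu \;\le\; \frac{\gamma^2}{\xi^2 p^2\, \mathbb{E} v} .
\end{align*}
```

The first inequality holds in both cases of the lemma, since every $x \in U$ contributes a deviation of at least $\xi\, p\, \mathbb{E} v$ with the same sign.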


The next lemma says that restrictions of the count $\Gamma(H)$ to small sets of vertices or pairs
of vertices yield small counts. This will be used in Section 4.2 to bound the contributions from
exceptional sets.

Lemma 3.8. Assume Setup 3.1 with $k \ge \frac{d(L(H^{sp})) + 2}{2}$. Let $\mathbf{x} \colon V(H) \to V(\Gamma)$ vary uniformly over
compatible maps. Let $u \colon V(\Gamma) \to [0, 1]$ be any function and write $u(\mathbf{x}) = \prod_{a \in V(H)} u(x_a)$. Let $E'$ be
a weighted graph with the same vertices as $\Gamma$ whose edge set is supported on $X_a \times X_b$ for $ab \notin H^{sp}$.
Let $H'$ be any graph with the same vertices as $H$. Then

$$\int_{\mathbf{x}} \Gamma(H \mid \mathbf{x})\, u(\mathbf{x})\, E'(H' \mid \mathbf{x}) \, d\mathbf{x} \le \left( (1 + c)^{e(H^{sp})} - 1 + \int_{\mathbf{x}} u(\mathbf{x})\, E'(H' \mid \mathbf{x}) \, d\mathbf{x} \right) p(H).$$

Lemma 3.8 follows by showing that

$$\left| \int_{\mathbf{x}} (\Gamma(H \mid \mathbf{x}) - p(H))\, u(\mathbf{x})\, E'(H' \mid \mathbf{x}) \, d\mathbf{x} \right| \le \left( (1 + c)^{e(H^{sp})} - 1 \right) p(H).$$

The proof is similar to that of Proposition 3.3. In the step analogous to (6), after applying the
jumbledness condition as our first inequality, we bound $u$ and $E'$ by 1 and then continue exactly
the same way.



4 Counting in G 

In this section we develop the counting lemma for subgraphs $G$ of $\Gamma$, as outlined in Section 2. The
two key ingredients are doubling and densification, which are discussed in Sections 4.1 and 4.2,
respectively. Here is the common setup for this section.

Setup 4.1. Assume Setup 3.1. Let $\epsilon > 0$. Let $G$ be a weighted subgraph of $\Gamma$. For every edge
$ab \in E(H)$, assume that $(X_a, X_b)_G$ satisfies $\mathrm{DISC}(q_{ab}, p_{ab}, \epsilon)$, where $0 \le q_{ab} \le p_{ab}$.

Unlike in Section 3, we do not make an effort to keep track of the unimportant coefficients of
$p(H)$ in the error bounds, as it would be cumbersome to do so. Instead, we use the $\approx$ notation
introduced in Section 3.2.

The goal of this section is to prove the following counting lemma. This is slightly more general
than Theorem 1.12 in that it allows $H$ to have both sparse and dense edges.

Proposition 4.2. Assume Setup 4.1 with $k \ge \min\left\{ \frac{\Delta(L(H^{sp})) + 4}{2}, \frac{d(L(H^{sp})) + 6}{2} \right\}$. Then

$$G(H) \approx^{p(H)}_{c,\epsilon} q(H).$$

The requirement on $k$ stated in Proposition 4.2 is not necessarily best possible. The proof of
the counting lemma will be by induction on the vertices of $H$, removing one vertex at a time. A
better bound on $k$ can sometimes be obtained by tracking the requirements on $k$ at each step of
the procedure, as explained in a tutorial in Section 4.5.



4.1 Doubling 

Doubling is a technique used to reduce the problem of counting embeddings of $H$ in $G$ to the
problem of counting embeddings of $H$ with one vertex deleted.

If $a \in V(H)$, $H_{a \times 2}$ is the graph $H$ with vertex $a$ doubled. In the assignment of vertices of $H_{a \times 2}$
to vertex subsets of $\Gamma$, the new vertex $a'$ is assigned to the same vertex subset as $a$. Let $H_a$ be the
subgraph of $H$ consisting of edges with $a$ as an endpoint, and let $H_{a, a \times 2}$ be $H_a$ with $a$ doubled. Let
$H_{-a}$ be the subgraph of $H$ consisting of edges not having $a$ as an endpoint. We refer to Figure 5
for an illustration.
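
Lemma 4.3 (Doubling). Let $H$ be a graph with vertex set $\{1, \ldots, m\}$. Let $\Gamma$ be a weighted graph
with vertex subsets $X_1, \ldots, X_m$, and let $G$ be a weighted subgraph of $\Gamma$. For each edge $bc$ of $H$, we
have numbers $0 \le q_{bc} \le p_{bc} \le 1$. Let $g$ be a function that agrees with $G$ on $X_i \times X_j$ whenever
$a \in \{i, j\}$ and with $\Gamma$ on $X_i \times X_j$ whenever $a \notin \{i, j\}$. Then

$$|G(H) - q(H)| \le q(H_a)\, |G(H_{-a}) - q(H_{-a})| + G(H_{-a})^{1/2} \left( g(H_{a \times 2}) - 2 q(H_a)\, g(H) + q(H_a)^2\, \Gamma(H_{-a}) \right)^{1/2}. \qquad (7)$$

Proof. Let $\mathbf{y}$ vary uniformly over compatible maps $V(H) \setminus \{a\} \to V(G)$ where $\mathbf{y}(b) \in X_b$ for each
$b \in V(H) \setminus \{a\}$. We have

$$G(H) - q(H) = q(H_a)\,(G(H_{-a}) - q(H_{-a})) + \int_{\mathbf{y}} (G(H_a \mid \mathbf{y}) - q(H_a))\, G(H_{-a} \mid \mathbf{y}) \, d\mathbf{y}.$$

It remains to bound the integral, which we can do using the Cauchy-Schwarz inequality.

$$\begin{aligned}
\left( \int_{\mathbf{y}} (G(H_a \mid \mathbf{y}) - q(H_a))\, G(H_{-a} \mid \mathbf{y}) \, d\mathbf{y} \right)^2
&\le \left( \int_{\mathbf{y}} G(H_{-a} \mid \mathbf{y}) \, d\mathbf{y} \right) \left( \int_{\mathbf{y}} (G(H_a \mid \mathbf{y}) - q(H_a))^2\, G(H_{-a} \mid \mathbf{y}) \, d\mathbf{y} \right) \\
&= G(H_{-a}) \int_{\mathbf{y}} (G(H_a \mid \mathbf{y}) - q(H_a))^2\, G(H_{-a} \mid \mathbf{y}) \, d\mathbf{y} \\
&\le G(H_{-a}) \int_{\mathbf{y}} (G(H_a \mid \mathbf{y}) - q(H_a))^2\, \Gamma(H_{-a} \mid \mathbf{y}) \, d\mathbf{y} \\
&= G(H_{-a}) \left( g(H_{a \times 2}) - 2 q(H_a)\, g(H) + q(H_a)^2\, \Gamma(H_{-a}) \right).
\end{aligned}$$

□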
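
The final equality in this proof comes from expanding the square and recognizing each resulting integral as a count. The display below is a worked expansion in the notation of the lemma; the only input is the definition of $g$, which uses $G$ on the pairs involving the part of $a$ (and hence of its copy $a'$) and $\Gamma$ on all other pairs.

```latex
\begin{align*}
\int_{\mathbf{y}} \bigl( G(H_a \mid \mathbf{y}) - q(H_a) \bigr)^2\, \Gamma(H_{-a} \mid \mathbf{y}) \, d\mathbf{y}
 &= \int_{\mathbf{y}} G(H_a \mid \mathbf{y})^2\, \Gamma(H_{-a} \mid \mathbf{y}) \, d\mathbf{y}
  - 2 q(H_a) \int_{\mathbf{y}} G(H_a \mid \mathbf{y})\, \Gamma(H_{-a} \mid \mathbf{y}) \, d\mathbf{y}
  + q(H_a)^2 \int_{\mathbf{y}} \Gamma(H_{-a} \mid \mathbf{y}) \, d\mathbf{y} \\
 &= g(H_{a \times 2}) - 2 q(H_a)\, g(H) + q(H_a)^2\, \Gamma(H_{-a}) .
\end{align*}
```

Here $G(H_a \mid \mathbf{y})^2$ is the average over two independent images of $a$ in $X_a$, which is exactly what embedding both $a$ and its copy $a'$ amounts to, so the first integral counts $H_{a \times 2}$ with the edges at $a$ and $a'$ in $G$ and the rest in $\Gamma$.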



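Using Lemma 3.6, we know that under appropriate hypotheses, we have

$$g(H_{a \times 2}) \approx^{p(H_{a \times 2})}_{c} p(H_{-a})\, G(H_{a, a \times 2}),$$

$$g(H) \approx^{p(H)}_{c} p(H_{-a})\, G(H_a)$$

and

$$\Gamma(H_{-a}) \approx^{p(H_{-a})}_{c} p(H_{-a}).$$

If we can show that

$$G(H_{a, a \times 2}) \approx^{p(H_{a, a \times 2})}_{c,\epsilon} q(H_{a, a \times 2})$$

and

$$G(H_a) \approx^{p(H_a)}_{c,\epsilon} q(H_a),$$

then the rightmost term in (7) is $\approx^{p(H)}_{c,\epsilon} 0$, which would reduce the problem to showing that
$G(H_{-a}) \approx^{p(H_{-a})}_{c,\epsilon} q(H_{-a})$. This reduction step is spelled out below. See Figure 5 for an illustration.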
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G{H) 

it 





G{H^a) G{Ha,ax2) G{Ha 

Figure 5: The doubling reduction. 



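Lemma 4.4. Assume Setup 4.1. Let $a \in V(H)$. Suppose that $k \ge$ […]. Suppose that

$$G(H_{-a}) \approx_{c,\epsilon}^{p(H_{-a})} q(H_{-a}), \qquad G(H_a) \approx_{c,\epsilon}^{p(H_a)} q(H_a), \qquad G(H_{a,a\times 2}) \approx_{c,\epsilon}^{p(H_{a,a\times 2})} q(H_{a,a\times 2}).$$

Then

$$G(H) \approx_{c,\epsilon}^{p(H)} q(H).$$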

Remark. We do not always need the full strength of Setup 4.1 (although it is convenient to state
it as such). For example, when H is a triangle with vertices {1,2,3}, $H_{-1}$ is a single edge, so we
do not need discrepancy on $(X_2, X_3)_G$ to obtain $G(H_{-1}) \approx_{c,\epsilon}^{p} q_{23}$. In particular, our approach gives
the triangle counting lemma in the form stated in Kohayakawa et al. [59], where discrepancy is
assumed for only two of the three pairs of vertex subsets of G.



4.2 Densification 

Densification is the technique that allows us to transform a subdivided edge of H into a single
dense edge, as summarized in the figure below. This section also contains a counting lemma for
trees (Proposition 4.9).

[Figure: a path of two sparse edges is replaced by a single dense edge.]

We introduce the following notation for the density analogues of degree and codegree. If $\Gamma$ (and
similarly G) is a weighted graph with vertex subsets X, Y, Z, then for $x \in X$ and $z \in Z$, we write

$$\Gamma(x,Y) = \int_{y \in Y} \Gamma(x,y)\, dy \qquad \text{and} \qquad \Gamma(x,Y,z) = \int_{y \in Y} \Gamma(x,y)\,\Gamma(y,z)\, dy.$$

Now we state the goal of this section.

Lemma 4.5 (Densification). Assume Setup 4.1 with $k \ge$ […]. Let 1, 2, 3 be vertices in H
such that 1 and 3 are the only neighbors of 2 in H, and $13 \notin E(H)$. Replace the induced bipartite
graph $(X_1,X_3)_G$ by the weighted bipartite graph defined by

$$G(x_1,x_3) = \frac{1}{2p_{12}p_{23}} \min\left\{G(x_1,X_2,x_3),\ 2p_{12}p_{23}\right\}.$$

Let H' denote the graph obtained from H by deleting edges 12 and 23 and adding edge 13. Let
$q_{13} = \frac{q_{12}q_{23}}{2p_{12}p_{23}}$ and $p_{13} = 1$. Then $(X_1,X_3)_G$ satisfies DISC$(q_{13}, 1, 2\epsilon + 18c)$ and

$$|G(H) - 2p_{12}p_{23}\,G(H')| \le \left((1+c)^{e(H)} - 1 + 26c^2\right)p(H).$$

Note that $q(H) = 2p_{12}p_{23}\,q(H')$. So we obtain the following reduction step as a corollary.

Corollary 4.6. Continuing with Lemma 4.5, if $G(H') \approx_{c,\epsilon}^{p(H')} q(H')$ then $G(H) \approx_{c,\epsilon}^{p(H)} q(H)$ in the
original graph.
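To make the densified edge weight concrete, here is a minimal sketch that computes the normalized codegree $G(x_1, X_2, x_3)$ and the capped, rescaled weight from Lemma 4.5 on small matrices. The matrix representation and the function names `codegree` and `densified_weight` are assumptions chosen for illustration.

```python
def codegree(G12, G23, x1, x3):
    """Normalized codegree G(x1, X2, x3): average over x2 of G12[x1][x2] * G23[x2][x3]."""
    n2 = len(G23)
    return sum(G12[x1][x2] * G23[x2][x3] for x2 in range(n2)) / n2


def densified_weight(G12, G23, p12, p23, x1, x3):
    """Weight of the new dense edge x1 x3: the codegree capped at 2*p12*p23, then rescaled to [0, 1]."""
    cap = 2 * p12 * p23
    return min(codegree(G12, G23, x1, x3), cap) / cap


# Small hand-made example (rows index the first vertex class, columns the second).
G12 = [[1, 0], [1, 1]]
G23 = [[1, 1], [0, 1]]
print(codegree(G12, G23, 0, 0))
print(densified_weight(G12, G23, 0.5, 0.5, 1, 1))
```

The cap is what keeps the new weights in [0, 1]; pairs whose codegree exceeds the cap are exactly the pairs whose contribution the proof has to bound separately.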

The proof of Lemma 4.5 consists of the following steps:

1. Show that the weighted graph on $X_1 \times X_3$ with weights $G(x_1,X_2,x_3)$ satisfies discrepancy.

2. Show that the capping of weights has negligible effect on discrepancy.

3. Show that the capping of weights has negligible effect on the H-count.

Steps 2 and 3 are done by bounding the contribution from pairs of vertices in $X_1 \times X_3$ which
have too high co-degree with $X_2$ in $\Gamma$.

We shall focus on the more difficult case when both edges 12 and 23 are sparse. The case when 
at least one of the two edges is dense is analogous and much easier. Let us start with a warm-up 
by showing how to do step 1 for the latter dense case. We shall omit the rest of the details in this 
case. 

Lemma 4.7. Let $0 \le q_1 \le p_1 \le 1$, $0 \le q_2 \le 1$, $\epsilon > 0$. Let G be a weighted graph with vertex subsets
X, Y, Z, such that $(X,Y)_G$ satisfies DISC$(q_1,p_1,\epsilon)$ and $(Y,Z)_G$ satisfies DISC$(q_2,1,\epsilon)$. Then the
graph G' on (X,Z) defined by $G'(x,z) = G(x,Y,z)$ satisfies DISC$(q_1q_2,p_1,2\epsilon)$.

Proof. Let $u : X \to [0,1]$ and $w : Z \to [0,1]$ be arbitrary functions. In the following integrals, let
x, y and z vary uniformly over X, Y and Z, respectively. We have

$$\int_{x,z} u(x)\left(G(x,Y,z) - q_1q_2\right)w(z)\, dx\,dz
= \int_{x,y,z} u(x)\left(G(x,y)G(y,z) - q_1q_2\right)w(z)\, dx\,dy\,dz$$
$$= \int_{x,y,z} u(x)\left(G(x,y) - q_1\right)G(y,z)\,w(z)\, dx\,dy\,dz + q_1 \int_{x,y,z} u(x)\left(G(y,z) - q_2\right)w(z)\, dx\,dy\,dz. \qquad (8)$$

Each of the two terms in the last sum is bounded by $\epsilon p_1$ in absolute value by the discrepancy
hypotheses. Therefore $(X,Z)_{G'}$ satisfies DISC$(q_1q_2,p_1,2\epsilon)$. □

The next lemma is step 1 for the sparse case. 

Lemma 4.8. Let $c, p, \epsilon \in (0,1]$ and $q_1, q_2 \in [0,p]$. Let $\Gamma$ be a graph with vertex subsets X, Y, Z,
and G a weighted subgraph of $\Gamma$. Suppose that

• $(X,Y)_\Gamma$ is $(p, cp^{3/2}\sqrt{|X||Y|})$-jumbled and $(X,Y)_G$ satisfies DISC$(q_1,p,\epsilon)$; and

• $(Y,Z)_\Gamma$ is $(p, cp^{3/2}\sqrt{|Y||Z|})$-jumbled and $(Y,Z)_G$ satisfies DISC$(q_2,p,\epsilon)$.

Then the graph G' on (X,Z) defined by $G'(x,z) = G(x,Y,z)$ satisfies DISC$(q_1q_2,p^2,3\epsilon+6c)$.

Remark. By unraveling the proof of Lemma ESI we see that the exponent of p in the jumbledness 
of $(X,Y)_\Gamma$ can be relaxed from $\frac{3}{2}$ to 1.

Proof. We begin the proof the same way as Lemma 4.7. In (8), the second integral is bounded in
absolute value by $\epsilon p^2$. We need to do more work to bound the first integral.
Define $v : Y \to [0,1]$ by

$$v(y) = \int_{z} G(y,z)\,w(z)\, dz.$$

So the first integral in (8), the quantity we need to bound, equals

$$\int_{x,y} u(x)\left(G(x,y) - q_1\right)v(y)\, dx\,dy. \qquad (9)$$

If we apply discrepancy immediately, we get a bound of $\epsilon p$, which is not small enough, as we need
a bound on the order of $o(p^2)$. The key observation is that $v(y)$ is bounded above by $2p$ on most
of Y. Indeed, let

$$Y' = \{y \in Y \mid \Gamma(y,Z) > 2p\}.$$

By Lemma 3.7 we have $|Y'| \le c^2p\,|Y|$. Since $v1_{Y \setminus Y'}$ is bounded above by $2p$, we can apply
discrepancy on $(X,Y)_G$ with the functions u and $\frac{1}{2p}v1_{Y \setminus Y'}$ to obtain

$$\left|\int_{x,y} u(x)\left(G(x,y) - q_1\right)v(y)\,1_{Y \setminus Y'}(y)\, dx\,dy\right| \le 2\epsilon p^2.$$

In the following calculation, the first inequality follows from the triangle inequality; the second
inequality follows from expanding $v(y)$ and using $G \le \Gamma$ and $u, w \le 1$; the third inequality follows
from Lemma 3.8 (applied with u in the Lemma being the function $1_{Y'}$ on Y and 1 everywhere else,
and H' the empty graph so that $G(H' \mid \mathbf{x}) = 1$ always).

$$\left|\int_{x,y} u(x)\left(G(x,y) - q_1\right)v(y)\,1_{Y'}(y)\, dx\,dy\right|
\le \int_{x,y} u(x)\,G(x,y)\,v(y)\,1_{Y'}(y)\, dx\,dy + \int_{x,y} u(x)\,q_1\,v(y)\,1_{Y'}(y)\, dx\,dy$$
$$\le \int_{x,y,z} \Gamma(x,y)\,1_{Y'}(y)\,\Gamma(y,z)\, dx\,dy\,dz + q_1\int_{y,z} 1_{Y'}(y)\,\Gamma(y,z)\, dy\,dz$$
$$\le \left((1+c)^2 - 1 + c^2p\right)p^2 + q_1\left((1+c) - 1 + c^2p\right)p$$
$$\le 6cp^2.$$

Therefore, (9) is at most $(2\epsilon + 6c)p^2$ in absolute value. Recall that the second integral in (8) was
bounded by $\epsilon p^2$. The result follows from combining these two estimates. □

The technique used in Lemma 4.8 also allows us to count trees in G.

Proposition 4.9. Assume Setup 4.1 with H a tree and $k \ge$ […]. Then

$$G(H) \approx_{c,\epsilon}^{p(H)} q(H).$$

In fact, it can be shown that the error has the form

$$|G(H) - q(H)| \le M_H(c + \epsilon)\,p(H)$$

for some real number $M_H > 0$ depending on H.

To prove Proposition 4.9, we formulate a weighted version and induct on the number of edges.
The weighted version is stated below.

Lemma 4.10. Assume the same setup as in Proposition 4.9. Let $u : V(G) \to [0,1]$ be any function.
Let $\mathbf{x}$ vary uniformly over all compatible maps $V(H) \to V(G)$. Write $u(\mathbf{x}) = \prod_{a \in V(H)} u(x_a)$. Then,

$$\int_{\mathbf{x}} G(H \mid \mathbf{x})\,u(\mathbf{x})\, d\mathbf{x} \approx_{c,\epsilon}^{p(H)} q(H) \int_{\mathbf{x}} u(\mathbf{x})\, d\mathbf{x}.$$

To prove Lemma 4.10, we remove one leaf of H at a time, using the technique in the proof of
Lemma 4.8 to transfer the weight of the leaf to its unique neighboring vertex and Lemma 3.8
to bound the contributions of the vertices with large degrees in $\Gamma$. We omit the details.

Continuing with the proof of densification, the following estimate is needed for steps 2 and 3. 



Lemma 4.11. Let $\Gamma$ be a graph with vertex subsets X, Y, Z, such that $(X,Y)_\Gamma$ is $(p, cp\sqrt{|X||Y|})$-jumbled
and $(Y,Z)_\Gamma$ is $(p, cp^{3/2}\sqrt{|Y||Z|})$-jumbled. Let

$$E' = \left\{(x,z) \in X \times Z \mid \Gamma(x,Y,z) > 2p^2\right\}.$$

Then $|E'| \le 26c^2\,|X||Z|$.

Proof. Let

$$X' = \left\{x \in X \;:\; |\Gamma(x,Y) - p| > \tfrac{p}{2}\right\}.$$

Then, by Lemma 3.7, $|X'| \le 8c^2\,|X|$. For every $x \in X$, let

$$Z'_x = \left\{z \in Z \mid \Gamma(x,Y,z) > 2p^2\right\}.$$

For $x \in X \setminus X'$, we have, again by Lemma 3.7, that $|Z'_x| \le 18c^2\,|Z|$. The result follows by noting
that $E' \subseteq (X' \times Z) \cup \{(x,z) \mid x \in X \setminus X',\ z \in Z'_x\}$. □

The following lemma is step 2 in the program. 

Lemma 4.12. Let $c, \epsilon, p \in (0,1]$ and $q_1, q_2 \in [0,p]$. Let $\Gamma$ be a graph with vertex subsets X, Y, Z,
and G a weighted subgraph of $\Gamma$. Suppose that

• $(X,Y)_\Gamma$ is $(p, cp\sqrt{|X||Y|})$-jumbled and $(X,Y)_G$ satisfies DISC$(q_1,p,\epsilon)$; and

• $(Y,Z)_\Gamma$ is $(p, cp^{3/2}\sqrt{|Y||Z|})$-jumbled and $(Y,Z)_G$ satisfies DISC$(q_2,p,\epsilon)$.

Then the graph G' on (X,Z) defined by $G'(x,z) = \min\{G(x,Y,z),\ 2p^2\}$ satisfies DISC$(q_1q_2,p^2,3\epsilon+35c)$.

Proof. Let $u : X \to [0,1]$ and $w : Z \to [0,1]$ be any functions. In the following integrals, x, y and z
vary uniformly over X, Y and Z, respectively. We have

$$\int_{x,z} \left(G'(x,z) - q_1q_2\right)u(x)\,w(z)\, dx\,dz
= \int_{x,z} \left(G(x,Y,z) - q_1q_2\right)u(x)\,w(z)\, dx\,dz - \int_{x,z} \left(G(x,Y,z) - G'(x,z)\right)u(x)\,w(z)\, dx\,dz.$$

The first integral on the right-hand side can be bounded in absolute value by $(3\epsilon + 6c)p^2$ by
Lemma 4.8. For the second integral, let $E' = \{(x,z) \in X \times Z \mid \Gamma(x,Y,z) > 2p^2\}$. We have

$$0 \le \int_{x,z} \left(G(x,Y,z) - G'(x,z)\right)u(x)\,w(z)\, dx\,dz
\le \int_{x,z} G(x,Y,z)\,1_{E'}(x,z)\, dx\,dz
\le \int_{x,y,z} \Gamma(x,y)\,\Gamma(y,z)\,1_{E'}(x,z)\, dx\,dy\,dz
\le 29cp^2$$

by Lemmas 3.8 and 4.11. The result follows by combining the estimates. □

Finally we prove step 3 in the program, thereby completing the proof of densification. 

Proof of Lemma 4.5. We prove the result when both edges 12 and 23 are sparse. When at least
one of 12 and 23 is dense, the proof is analogous and easier.

Lemma 4.12 implies that $(X_1,X_3)_G$ satisfies DISC$(q_{13}, \frac{1}{2}, 3\epsilon+35c)$, and hence it must also satisfy
DISC$(q_{13}, 1, 2\epsilon+18c)$.

Let $E' = \{(x_1,x_3) \mid \Gamma(x_1,X_2,x_3) > 2p_{12}p_{23}\}$. We have $|E'| \le 26c^2\,|X_1||X_3|$ by Lemma 4.11. In
the following integrals, let $\mathbf{x} : V(H) \to V(\Gamma)$ vary uniformly over all compatible maps. Then

$$0 \le G(H) - 2p_{12}p_{23}\,G(H') \le \int_{\mathbf{x}} \Gamma(H \mid \mathbf{x})\,1_{E'}(x_1,x_3)\, d\mathbf{x} \le \left((1+c)^{e(H)} - 1 + 26c^2\right)p(H)$$

by Lemma 3.8. □

4.3 Counting C4 

With the tools of doubling and densification, we are now ready to count embeddings in G. We 
start by showing how to count C4, as it is an important yet tricky step. 



Proposition 4.13. Assume Setup 4.1 with $H = C_4$ and $k \ge 2$. Then

$$|G(C_4) - q(C_4)| \le 100(c + \epsilon)^{1/2}\,p(C_4).$$

The constant 100 is unimportant. It can be obtained by unraveling the calculations. We omit
the details.

Proposition 4.13 follows from repeated applications of doubling (Lemma 4.4) and densification
(Corollary 4.6). The chain of implications is summarized in Figure 6 in the case when all four
edges of $C_4$ are sparse (the other cases are easier). In the figure, each graph represents a claim of
the form $G(H) \approx_{c,\epsilon}^{p(H)} q(H)$. The sparse and dense edges are distinguished by thickness. The claim
for the dense triangle follows from the counting lemma for dense graphs (Proposition 1.8) and the
claim for the rightmost graph follows from Lemma 4.7.

Figure 6: The proof that $G(C_4) \approx_{c,\epsilon}^{p(C_4)} q(C_4)$. The vertical arrows correspond to densification, doubling
the top vertex and densification, respectively.



4.4 Finishing the proof of the counting lemma 

Given a graph H, we can use the doubling lemma, Lemma 4.4, to reduce the problem of counting
H in G to the problem of counting $H_{-a}$ in G, where $H_{-a}$ is H with some vertex a deleted, provided
we can also count $H_a$ and $H_{a,a\times 2}$. Suppose a has degree t in H and degree t' in $H^{sp}$. The graph
$H_a$ is isomorphic to some $K_{1,t}$. Since $K_{1,t}$ is a tree, we can count copies using Proposition 4.9,
provided that the exponent of p in the jumbledness of $\Gamma$ satisfies $k \ge$ […]. The following lemma
shows that we can count embeddings of $H_{a,a\times 2}$ as well.

Lemma 4.14. Assume Setup 4.1 where $H = K_{2,t}$ with vertices $\{a_1, a_2; b_1, \ldots, b_t\}$. Assume that
the edges $a_ib_j$ are sparse for $1 \le j \le t'$ and dense for $j > t'$. If $k \ge$ […] then

$$G(H) \approx_{c,\epsilon}^{p(H)} q(H).$$

Proof. When t' = 0, all edges of H are dense, so the result follows from the dense counting lemma.
So assume $t' \ge 1$. First we apply densification as follows:

[Figure: densification applied to $K_{2,t}$.]

When t' = 1, we get a dense graph so we are done. Otherwise, the result follows by induction
using doubling as shown below, where we use Propositions 4.13 and 4.9 to count $C_4$ and $K_{1,2}$,
respectively.

[Figure: the doubling step in the induction for $K_{2,t}$.] □

Once we can count $H_a$ and $H_{a,a\times 2}$, we obtain the following reduction result via doubling.

Lemma 4.15. Assume Setup 4.1. Let a be a vertex of H. If $k \ge$ […], then $G(H_{-a}) \approx_{c,\epsilon}^{p(H_{-a})} q(H_{-a})$
implies $G(H) \approx_{c,\epsilon}^{p(H)} q(H)$.

The proof of the counting lemma follows once we keep track of the requirements on k. 



Proof of Proposition 4.2. When H has no sparse edges, the result follows from the dense counting
lemma (Proposition 1.8). Otherwise, using Lemma 4.15, it remains to show that if

$$k \ge \min\left\{\frac{\Delta(L(H^{sp})) + 4}{2},\ \frac{d(L(H^{sp})) + 6}{2}\right\},$$

then there exists some vertex a of H satisfying $k \ge$ […].

Actually, the hypothesis on k is strong enough that any a will do. Indeed, we have $\Delta(L(H^{sp})) + 2 \ge
\Delta(L(H^{sp}_{a\times 2})) \ge \Delta(L(H^{sp}_{a\times 2}, H^{sp}_{-a}))$ since doubling a increases the degree of every vertex by at most
1. We also have $d(L(H^{sp})) \ge d(L(H^{sp}_{a\times 2}, H^{sp}_{-a})) - 4$ since every edge in $H^{sp}_{-a}$ shares an endpoint
with at most 4 edges in $H^{sp}_{a,a\times 2}$. □



4.5 Tutorial: determining jumbledness requirements 

The jumbledness requirements stated in our counting lemmas are often not the best that come out 
of our proofs. We had to make a tradeoff between strength and simplicity while formulating the 
results. In this section, we give a short tutorial on finding the jumbledness requirements needed for 
our counting lemma to work for any particular graph H. These fine-tuned bounds can be extracted 
from a careful examination of our proofs, with no new ideas introduced in this section. 

We work in a more general setting where we allow non-balanced jumbledness conditions between
vertex subsets of $\Gamma$. This will arise naturally in Section 6 when we prove a one-sided counting
lemma.

Setup 4.16. Let $\Gamma$ be a graph with vertex subsets $X_1, \ldots, X_m$. Let $p, c \in (0,1]$. Let H be a
graph with vertex set $\{1, \ldots, m\}$, with vertex a assigned to $X_a$. For every edge ab in H, one of the
following two holds:

• $(X_a,X_b)_\Gamma$ is $(p, cp^{k_{ab}}\sqrt{|X_a||X_b|})$-jumbled for some $k_{ab} \ge 1$, in which case we set $p_{ab} = p$ and
say that ab is a sparse edge, or

• $(X_a,X_b)_\Gamma$ is a complete bipartite graph, in which case we set $p_{ab} = 1$ and say that ab is a
dense edge.

Let $H^{sp}$ denote the subgraph of H consisting of sparse edges.

Let $\epsilon > 0$. Let G be a weighted subgraph of $\Gamma$. For every edge $ab \in E(H)$, assume that
$(X_a,X_b)_G$ satisfies DISC$(q_{ab},p_{ab},\epsilon)$, where $0 \le q_{ab} \le p_{ab}$.

In the figures in this section, we label the edges by the lower bounds on $k_{ab}$ that are sufficient
for the two-sided counting lemma to hold. For instance, the figure below shows the jumbledness
conditions that are sufficient for the triangle counting lemma, namely $k_{ab} \ge 3$, $k_{bc} \ge 2$, $k_{ac} \ge \frac{3}{2}$.

[Figure: a triangle on vertices a, b, c with its edges labeled by these exponents.]




Although we are primarily interested in embeddings of H into G, we need to consider partial
embeddings where some of the edges of H are allowed to embed into $\Gamma$. So we encounter three
types of edges of H, summarized in Table 2. (Note that for dense edges ab, $(X_a,X_b)_\Gamma$ is a complete
bipartite graph, so such embeddings are trivial and ab can be ignored.)



Table 2: Types of edges in H.

Figure            Name            Description
[edge drawing]    Jumbled edge    An edge to be embedded in $(X_a,X_b)_\Gamma$ with $p_{ab} = p$ and $k_{ab} \ge 1$.
[edge drawing]    Dense edge      An edge to be embedded in $(X_a,X_b)_G$ with $p_{ab} = 1$.
[edge drawing]    Sparse edge     An edge to be embedded in $(X_a,X_b)_G$ with $p_{ab} = p$ and $k_{ab} \ge 1$.


Our counting lemma is proved through a number of reduction procedures. At each step, we 
transform H into one or more other graphs H' . At the end of the reduction procedure, we should 
arrive at a graph which only has dense edges. To determine the jumbledness conditions required to 
count some H, we perform these reduction steps and keep track of the requirements at each step. 
We explain how to do this for each reduction procedure. 



Removing a jumbled edge. To remove a jumbled edge ab from H, we need $k_{ab}$ to be at least
the average of the sparse degrees (i.e., counting both sparse and jumbled edges) at the endpoints
of ab, i.e., $k_{ab} \ge \frac{1}{2}\left(\deg_{H^{sp}}(a) + \deg_{H^{sp}}(b)\right)$. See the corresponding lemma in Section 3. For
example, $k_{ab} \ge$ […] is sufficient to remove the edge ab in the graph below.




Footnote (to the triangle counting lemma above): As mentioned in the remark after Lemma 4.4, we do not actually need DISC on $(X_a,X_b)_G$, since edge density is
enough. We do not dwell on this point in this section and instead focus on jumbledness requirements.

By removing jumbled edges one at a time, we can find conditions that are sufficient for counting
embeddings into $\Gamma$ (Proposition 3.3). The following figure shows how this is done for a 4-cycle.

[Figure: a 4-cycle of jumbled edges reduced by removing one edge at a time; the edge labels visible in the source include 3/2 and 1.]
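The edge-removal rule can be turned into a small calculation. The sketch below assumes, as one reading of the rule, that the degrees are taken in the graph formed by the edges not yet removed; the function name `removal_requirements` and the edge-list input format are hypothetical.

```python
from fractions import Fraction


def removal_requirements(edges_in_removal_order):
    """For each edge ab, in the given removal order, return the lower bound
    (deg(a) + deg(b)) / 2 on k_ab, with degrees counted among edges not yet removed."""
    remaining = list(edges_in_removal_order)
    bounds = []
    for a, b in edges_in_removal_order:
        deg_a = sum(1 for e in remaining if a in e)
        deg_b = sum(1 for e in remaining if b in e)
        bounds.append(Fraction(deg_a + deg_b, 2))
        remaining.remove((a, b))  # the edge is gone for later steps
    return bounds


# A 4-cycle 1-2-3-4-1 with all edges jumbled, removed in cyclic order.
print(removal_requirements([(1, 2), (2, 3), (3, 4), (4, 1)]))
```

Under this reading, the first edge removed from the cycle sees two endpoints of degree 2, and each later edge sees smaller degrees, so the order of removal decides which pair carries the strongest requirement.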



Doubling The figure below illustrates doubling. If the jumbledness hypotheses are sufficient to 
count the two graphs on the right, then they are sufficient to count the original graph. The first 
graph is produced by deleting all edges with a as an endpoint, and the second graph is produced 
by doubling a and then, for all edges not adjacent to a, deleting the dense edges and converting 
sparse edges to jumbled ones. 




Densification To determine the jumbledness needed to perform densification, delete all dense
edges, transform all sparse edges into jumbled edges, and use the earlier method to determine the
jumbledness required to count embeddings into $\Gamma$. For example, the jumbledness on the left figure
below shows the requirements on $C_4$ needed to perform the densification step. It may be the case
that even stronger hypotheses are needed to count the new graph (although for this example this
is not the case).

[Figure: $C_4$ with labeled edges (labels visible in the source include 3/2 and 1) and the graph obtained from it by densification.]




Trees To determine the jumbledness needed to count some tree H, delete all dense edges in H
and transform all sparse edges into jumbled edges and use the earlier method, removing one leaf
at a time to determine the jumbledness required to count embeddings into $\Gamma$ (Proposition 4.9).

Example 4.17 (C4). Let us check that the labeling of C4 in the densification paragraph gives suf- 
ficient jumbledness to count C4. It remains to check that the jumbledness hypotheses are sufficient 
to count the triangle with a single edge. We can double the top vertex so that it remains to check 
the first graph below (the other graph produced from doubling is a single edge, which is trivial to 
count). We can remove the jumbled edge, and then perform densification to get a dense triangle, 
which we know how to count. 

Example 4.18 ($K_3$). The following diagram illustrates the process of checking sufficient jumbledness
hypotheses to count triangles (again, the first graph resulting from doubling is a single edge
and is thus omitted from the figure). The sufficiency for $C_4$ follows from the previous example.

Example 4.19 ($K_{2,t}$). The following diagram shows sufficient jumbledness to count $K_{2,4}$. The
same pattern holds for $K_{2,t}$. The reduction procedure was given in the proof of Lemma 4.14. First
we perform densification to the two leftmost edges, and then apply doubling to the remaining
middle vertices in order from left to right.

Example 4.20 ($K_{1,2,2}$). The following diagram shows sufficient jumbledness to count $K_{1,2,2}$. This
example will be used in the next section on inheriting regularity.




5 Inheriting regularity 

Regularity is inherited on large subsets, in the sense that if $(X,Y)_G$ satisfies DISC$(q,1,\epsilon)$, then for
any $U \subseteq X$ and $V \subseteq Y$, the induced pair $(U,V)_G$ satisfies DISC$(q,1,\epsilon')$ with $\epsilon' = \frac{|X||Y|}{|U||V|}\epsilon$. This
is a trivial consequence of the definition of discrepancy, and the change in $\epsilon$ comes from rescaling
the measures dx and dy after restricting the uniform distribution to a subset. The loss in $\epsilon$ is a
constant factor as long as $\frac{|U|}{|X|}$ and $\frac{|V|}{|Y|}$ are bounded from below. So if G is a dense tripartite graph
with vertex subsets X, Y, Z, with each pair being dense and regular, then we expect that for most
vertices $z \in Z$, its neighborhoods $N_X(z)$ and $N_Y(z)$ are large, and hence they induce regular pairs
with only a constant factor loss in the discrepancy parameter $\epsilon$.

The above argument does not hold in sparse pseudorandom graphs. It is still true that if
$(X,Y)_G$ satisfies DISC$(q,p,\epsilon)$ then for any $U \subseteq X$ and $V \subseteq Y$ the induced pair $(U,V)_G$ satisfies
DISC$(q,p,\epsilon')$ with $\epsilon' = \frac{|X||Y|}{|U||V|}\epsilon$. However, in the tripartite setup from the previous paragraph, we
expect most $N_X(z)$ to have size on the order of $p\,|X|$, and similarly for $N_Y(z)$. So the naive approach shows that most $z \in Z$
induce a bipartite graph satisfying DISC$(q,p,\epsilon')$ where $\epsilon'$ is on the order of $\epsilon/p^2$. This is undesirable,
as we do not want $\epsilon'$ to depend on p.
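The size of the loss from the naive restriction argument is easy to tabulate. The helper below is a minimal sketch of the rescaling formula $\epsilon' = \frac{|X||Y|}{|U||V|}\epsilon$; the function name and the sample sizes are assumptions chosen for illustration.

```python
def inherited_epsilon(eps, size_x, size_y, size_u, size_v):
    """Discrepancy parameter after restricting (X, Y) to (U, V) by plain rescaling."""
    return eps * (size_x * size_y) / (size_u * size_v)


n, eps = 10_000, 0.001

# Dense setting: neighborhoods contain a constant fraction (here one half) of each side.
print(inherited_epsilon(eps, n, n, n // 2, n // 2))

# Sparse setting: neighborhoods have size about p * n, so the loss is a factor 1 / p^2.
p = 0.01
print(inherited_epsilon(eps, n, n, int(p * n), int(p * n)))
```

With these sample values the dense case only multiplies $\epsilon$ by 4, while the sparse case multiplies it by $1/p^2 = 10^4$, which is why a different argument is needed.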

It turns out that for most $z \in Z$, the bipartite graph induced by the neighborhoods satisfies
DISC$(q,p,\epsilon')$ for some $\epsilon'$ depending on $\epsilon$ but not p. In this section we prove this fact using the
counting lemma developed earlier in the paper. We recall the statement from the introduction.

Proposition 1.13. For any $\alpha > 0$, $\xi > 0$ and $\epsilon' > 0$, there exists $c > 0$ and $\epsilon > 0$ of size at least
polynomial in $\alpha, \xi, \epsilon'$ such that the following holds.

Let $p \in (0,1]$ and $q_{XY}, q_{XZ}, q_{YZ} \in [\alpha p, p]$. Let $\Gamma$ be a tripartite graph with vertex subsets X, Y
and Z and G be a subgraph of $\Gamma$. Suppose that

• $(X,Y)_\Gamma$ is $(p, cp^{4}\sqrt{|X||Y|})$-jumbled and $(X,Y)_G$ satisfies DISC$(q_{XY},p,\epsilon)$; and

• $(X,Z)_\Gamma$ is $(p, cp^{2}\sqrt{|X||Z|})$-jumbled and $(X,Z)_G$ satisfies DISC$(q_{XZ},p,\epsilon)$; and

• $(Y,Z)_\Gamma$ is $(p, cp^{3}\sqrt{|Y||Z|})$-jumbled and $(Y,Z)_G$ satisfies DISC$(q_{YZ},p,\epsilon)$.

Then at least $(1-\xi)|Z|$ vertices $z \in Z$ have the property that $|N_X(z)| \ge (1-\xi)q_{XZ}|X|$, $|N_Y(z)| \ge
(1-\xi)q_{YZ}|Y|$, and $(N_X(z),N_Y(z))_G$ satisfies DISC$(q_{XY},p,\epsilon')$.

The idea of the proof is to first show that a bound on the $K_{2,2}$ count implies DISC and then to
use the $K_{1,2,2}$ count to bound the $K_{2,2}$ count between neighborhoods.

We also state a version where only one side gets smaller. While the previous proposition is 
sufficient for embedding cliques, this second version will be needed for embedding general graphs 
H. 

Proposition 5.1. For any $\alpha > 0$, $\xi > 0$ and $\epsilon' > 0$, there exists $c > 0$ and $\epsilon > 0$ of size at least
polynomial in $\alpha, \xi, \epsilon'$ such that the following holds.

Let $p \in (0,1]$ and $q_{XY}, q_{XZ} \in [\alpha p, p]$. Let $\Gamma$ be a tripartite graph with vertex subsets X, Y and
Z and G be a subgraph of $\Gamma$. Suppose that

• $(X,Y)_\Gamma$ is $(p, cp^{5/2}\sqrt{|X||Y|})$-jumbled and $(X,Y)_G$ satisfies DISC$(q_{XY},p,\epsilon)$; and

• $(X,Z)_\Gamma$ is $(p, cp^{3/2}\sqrt{|X||Z|})$-jumbled and $(X,Z)_G$ satisfies DISC$(q_{XZ},p,\epsilon)$.

Then at least $(1-\xi)|Z|$ vertices $z \in Z$ have the property that $|N_X(z)| \ge (1-\xi)q_{XZ}|X|$ and
$(N_X(z),Y)_G$ satisfies DISC$(q_{XY},p,\epsilon')$.

5.1 C4 implies DISC 

From our counting lemma we already know that if G is a subgraph of a sufficiently jumbled graph
with vertex subsets X and Y such that $(X,Y)_G$ satisfies DISC$(q,p,\epsilon)$, then the number of $K_{2,2}$ in
G across X and Y is roughly $q^4|X|^2|Y|^2$. In this section, we show that the converse is true, that
the $K_{2,2}$ count implies discrepancy, even without any jumbledness hypotheses.
In what follows, for any function $f : X \times Y \to \mathbb{R}$, we write

$$f(K_{s,t}) = \int_{\substack{x_1,\ldots,x_s \in X \\ y_1,\ldots,y_t \in Y}} \prod_{i=1}^{s}\prod_{j=1}^{t} f(x_i,y_j)\, dx_1 \cdots dx_s\, dy_1 \cdots dy_t.$$

The following lemma shows that a bound on the "de-meaned" $C_4$-count implies discrepancy.

Lemma 5.2. Let G be a bipartite graph between vertex sets X and Y. Let $0 \le q \le p \le 1$ and
$\epsilon > 0$. Define $f : X \times Y \to \mathbb{R}$ by $f(x,y) = G(x,y) - q$. If $f(K_{2,2}) \le \epsilon^4p^4$ then $(X,Y)_G$ satisfies
DISC$(q,p,\epsilon)$.

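Proof. Let $u : X \to [0,1]$ and $v : Y \to [0,1]$ be any functions. Applying the Cauchy-Schwarz
inequality twice, we have

$$\left(\int_{x \in X}\int_{y \in Y} f(x,y)\,u(x)\,v(y)\, dy\,dx\right)^4
\le \left(\int_{x \in X}\left(\int_{y \in Y} f(x,y)\,u(x)\,v(y)\, dy\right)^2 dx\right)^2$$
$$\le \left(\int_{x \in X}\left(\int_{y \in Y} f(x,y)\,v(y)\, dy\right)^2 dx\right)^2$$
$$= \left(\int_{x \in X}\int_{y,y' \in Y} f(x,y)\,f(x,y')\,v(y)\,v(y')\, dy\,dy'\,dx\right)^2$$
$$= \left(\int_{y,y' \in Y}\left(\int_{x \in X} f(x,y)\,f(x,y')\,v(y)\,v(y')\, dx\right) dy\,dy'\right)^2$$
$$\le \int_{y,y' \in Y}\left(\int_{x \in X} f(x,y)\,f(x,y')\,v(y)\,v(y')\, dx\right)^2 dy\,dy'$$
$$= \int_{y,y' \in Y} v(y)^2\,v(y')^2\left(\int_{x \in X} f(x,y)\,f(x,y')\, dx\right)^2 dy\,dy'$$
$$\le \int_{y,y' \in Y}\left(\int_{x \in X} f(x,y)\,f(x,y')\, dx\right)^2 dy\,dy'$$
$$= \int_{y,y' \in Y}\int_{x,x' \in X} f(x,y)\,f(x,y')\,f(x',y)\,f(x',y')\, dx\,dx'\,dy\,dy'$$
$$= f(K_{2,2}) \le \epsilon^4p^4.$$

Thus

$$\left|\int_{x \in X}\int_{y \in Y} \left(G(x,y) - q\right)u(x)\,v(y)\, dy\,dx\right| \le \epsilon p.$$

Hence $(X,Y)_G$ satisfies DISC$(q,p,\epsilon)$. □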
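The chain of inequalities in the proof of Lemma 5.2 can be checked numerically on small random instances. This is a sketch under stated assumptions: the helper names `k22`, `weighted_discrepancy` and `check_lemma` are made up for the example, integrals are replaced by averages over finite sets, and G is a random 0/1 matrix.

```python
import random


def k22(f):
    """f(K_{2,2}): average over x, x', y, y' of f(x,y) f(x,y') f(x',y) f(x',y')."""
    nx, ny = len(f), len(f[0])
    total = 0.0
    for y in range(ny):
        for y2 in range(ny):
            inner = sum(f[x][y] * f[x][y2] for x in range(nx)) / nx
            total += inner * inner
    return total / (ny * ny)


def weighted_discrepancy(f, u, v):
    """Average over x, y of f(x,y) u(x) v(y)."""
    nx, ny = len(f), len(f[0])
    return sum(f[x][y] * u[x] * v[y] for x in range(nx) for y in range(ny)) / (nx * ny)


def check_lemma(nx=6, ny=5, q=0.3, seed=0):
    """Return (discrepancy^4, f(K_{2,2})) for random G, u, v with f = G - q."""
    rng = random.Random(seed)
    f = [[(1.0 if rng.random() < q else 0.0) - q for _ in range(ny)] for _ in range(nx)]
    u = [rng.random() for _ in range(nx)]
    v = [rng.random() for _ in range(ny)]
    return weighted_discrepancy(f, u, v) ** 4, k22(f)
```

For every seed the first returned value should be at most the second, which is the statement that the fourth power of any weighted discrepancy is controlled by the de-meaned 4-cycle count.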



Lemma 5.3. Let G be a bipartite graph between vertex sets X and Y. Let $0 \le q \le p \le 1$
and $\epsilon > 0$. Let $U \subseteq X$ and $V \subseteq Y$. Let $\mu = \frac{|U|}{|X|}$ and $\nu = \frac{|V|}{|Y|}$. Define $f : X \times Y \to \mathbb{R}$ by
$f(x,y) = (G(x,y) - q)\,1_U(x)\,1_V(y)$. If $f(K_{2,2}) \le \epsilon^4p^4\mu^2\nu^2$, then $(U,V)_G$ satisfies DISC$(q,p,\epsilon)$.

Proof. This lemma is equivalent to Lemma 5.2 after appropriate rescaling of the measures dx and
dy. □

The above lemmas are sufficient for proving inheritance of regularity, so that the reader may 
now skip to the next subsection. The rest of this subsection contains a proof that an upper bound 
on the actual C4 count implies discrepancy, a result of independent interest which is discussed 
further in Section 9.2 on relative quasirandomness.

Proposition 5.4. Let G be a bipartite graph between vertex sets X and Y. Let $0 \le q \le 1$ and $\epsilon > 0$.
Suppose $G(K_{1,1}) \ge (1-\epsilon)q$ and $G(K_{2,2}) \le (1+\epsilon)^4q^4$, then $(X,Y)_G$ satisfies DISC$(q,q,4\epsilon^{1/36})$.

The hypotheses in Proposition 5.4 actually imply two-sided bounds on $G(K_{1,2})$,
$G(K_{2,1})$, and $G(K_{2,2})$, by the following lemma.

Lemma 5.5. Let G be a bipartite graph between vertex sets X and Y and $f : X \times Y \to \mathbb{R}$ be any
function. Then $f(K_{1,1})^4 \le f(K_{1,2})^2 \le f(K_{2,2})$.

Proof. The result follows from two applications of the Cauchy-Schwarz inequality.

$$f(K_{2,2}) = \int_{y,y' \in Y}\int_{x,x' \in X} f(x,y)\,f(x,y')\,f(x',y)\,f(x',y')\, dx\,dx'\,dy\,dy'$$
$$= \int_{y,y' \in Y}\left(\int_{x \in X} f(x,y)\,f(x,y')\, dx\right)^2 dy\,dy'$$
$$\ge \left(\int_{y,y' \in Y}\int_{x \in X} f(x,y)\,f(x,y')\, dx\,dy\,dy'\right)^2$$
$$= \left(\int_{x \in X}\left(\int_{y \in Y} f(x,y)\, dy\right)^2 dx\right)^2$$
$$\ge \left(\int_{x \in X}\int_{y \in Y} f(x,y)\, dy\,dx\right)^4. \qquad \Box$$



A bound on $K_{1,2}$ is a second moment bound on the degree distribution, so we can bound the
number of vertices of low degree using Chebyshev's inequality, as done in the next lemma. Recall
the notation $G(x,S) = \int_{y \in Y} G(x,y)\,1_S(y)\, dy$ for $S \subseteq Y$ as the normalized degree.

Lemma 5.6. Let G be a bipartite graph between vertex sets X and Y. Let $0 \le q \le 1$ and $\epsilon > 0$.
Suppose $G(K_{1,1}) \ge (1-\epsilon)q$ and $G(K_{1,2}) \le (1+\epsilon)^2q^2$. Let $X' = \{x \in X \mid G(x,Y) < (1-2\epsilon^{1/3})q\}$.
Then $|X'| \le 2\epsilon^{1/3}\,|X|$.

Proof. We have

$$\frac{|X'|}{|X|}\left(2\epsilon^{1/3}q\right)^2 \le \int_{x \in X}\left(G(x,Y) - q\right)^2 dx
= G(K_{1,2}) - 2q\,G(K_{1,1}) + q^2
\le (1+\epsilon)^2q^2 - 2(1-\epsilon)q^2 + q^2
\le 5\epsilon q^2.$$

Thus $\frac{|X'|}{|X|} \le \frac{5}{4}\epsilon^{1/3}$. □
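The Chebyshev step in Lemma 5.6 can be tried out on random bipartite graphs. In this sketch the function names are hypothetical, G is a random 0/1 matrix, and $\epsilon$ is taken to be the smallest nonnegative value for which both hypotheses of the lemma hold for the given G and q.

```python
import random


def random_bipartite(nx, ny, density, seed):
    """Random 0/1 biadjacency matrix; an illustrative input, not from the paper."""
    rng = random.Random(seed)
    return [[1.0 if rng.random() < density else 0.0 for _ in range(ny)] for _ in range(nx)]


def low_degree_stats(G, q):
    """Return (eps, fraction of x with normalized degree below (1 - 2 eps^(1/3)) q)."""
    nx, ny = len(G), len(G[0])
    deg = [sum(row) / ny for row in G]      # normalized degrees G(x, Y)
    k11 = sum(deg) / nx                     # G(K_{1,1})
    k12 = sum(d * d for d in deg) / nx      # G(K_{1,2}): second moment of the degrees
    # Smallest eps >= 0 with G(K_{1,1}) >= (1 - eps) q and G(K_{1,2}) <= (1 + eps)^2 q^2.
    eps = max(0.0, 1 - k11 / q, k12 ** 0.5 / q - 1)
    threshold = (1 - 2 * eps ** (1 / 3)) * q
    frac = sum(1 for d in deg if d < threshold) / nx
    return eps, frac


eps, frac = low_degree_stats(random_bipartite(20, 20, 0.5, seed=1), q=0.5)
print(eps, frac, 2 * eps ** (1 / 3))
```

The printed fraction should never exceed the printed bound $2\epsilon^{1/3}$; on typical random inputs it is far smaller, since the lemma only uses the second moment.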
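We write $G(P)$, where P denotes the path with three edges, for

$$G(P) = \int_{\substack{x,x' \in X \\ y,y' \in Y}} G(x,y)\,G(x',y)\,G(x',y')\, dx\,dx'\,dy\,dy'.$$

The next lemma proves a lower bound on $G(P)$ by discarding vertices of low degree.

Lemma 5.7. Let G be a bipartite graph between vertex sets X and Y. Let $0 \le q \le 1$ and
$\epsilon > 0$. Suppose $G(K_{1,1}) \ge (1-\epsilon)q$, $G(K_{1,2}) \le (1+\epsilon)^2q^2$ and $G(K_{2,1}) \le (1+\epsilon)^2q^2$. Then

$$G(P) \ge (1 - 14\epsilon^{1/9})\,q^3.$$

Proof. Let

$$X' = \left\{x \in X \mid G(x,Y) < (1-2\epsilon^{1/3})q\right\}.$$

Let G' denote the subgraph of G where we remove all edges with an endpoint in X'. Then $G'(K_{2,1}) \le
G(K_{2,1}) \le (1+\epsilon)^2q^2$ and, by Lemma 5.6,

$$G'(K_{1,1}) \ge \frac{|X \setminus X'|}{|X|}(1-2\epsilon^{1/3})q \ge (1-2\epsilon^{1/3})^2q \ge (1-4\epsilon^{1/3})q.$$

Let

$$Y' = \left\{y \in Y \mid G(X \setminus X', y) < (1-4\epsilon^{1/9})q\right\}.$$

So $|Y'| \le 4\epsilon^{1/9}\,|Y|$ by applying Lemma 5.6 again. Restricting to paths with vertices in $X \setminus X'$, $Y \setminus
Y'$, $X \setminus X'$, $Y$, we find that

$$G(P) \ge \frac{|Y \setminus Y'|}{|Y|}\left(\min_{y \in Y \setminus Y'} G(X \setminus X', y)\right)^2\left(\min_{x \in X \setminus X'} G(x,Y)\right)
\ge (1-4\epsilon^{1/9})^3(1-2\epsilon^{1/3})\,q^3
\ge (1-14\epsilon^{1/9})\,q^3. \qquad \Box$$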

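The above argument can be modified to show that a bound on $K_{1,2}$ implies one-sided counting
for trees. We state the generalization and omit the proof.

Proposition 5.8. Let H be a tree on vertices $\{1, 2, \ldots, m\}$. For every $\theta > 0$ there exists $\epsilon > 0$ of
size polynomial in $\theta$ so that the following holds.

Let G be a weighted graph with vertex subsets $X_1, \ldots, X_m$. For every edge ab of H, assume
there is some $q_{ab} \in [0,1]$ so that the bipartite graph $(X_a,X_b)_G$ satisfies $G(K_{1,1}) \ge (1-\epsilon)q_{ab}$,
$G(K_{1,2}) \le (1+\epsilon)^2q_{ab}^2$, and $G(K_{2,1}) \le (1+\epsilon)^2q_{ab}^2$. Then $G(H) \ge (1-\theta)q(H)$.

Proof of Proposition 5.4. Using Lemma 5.5, we have $G(K_{1,2}) \le (1+\epsilon)^2q^2$, $G(K_{2,1}) \le (1+\epsilon)^2q^2$
and $G(K_{1,1}) \le (1+\epsilon)q$. Let $f(x,y) = G(x,y) - q$. Writing $G(P)$ for the count of paths with three
edges, $G(P) = \int G(x,y)\,G(x',y)\,G(x',y')\, dx\,dx'\,dy\,dy'$, and applying Lemma 5.7, we have

$$f(K_{2,2}) = G(K_{2,2}) - 4q\,G(P) + 2q^2\,G(K_{1,1})^2 + 4q^2\,G(K_{1,2}) - 4q^3\,G(K_{1,1}) + q^4$$
$$\le (1+\epsilon)^4q^4 - 4(1-14\epsilon^{1/9})q^4 + 2(1+\epsilon)^2q^4 + 4(1+\epsilon)^2q^4 - 4(1-\epsilon)q^4 + q^4.$$

For $\epsilon \le 1$ the right-hand side is at most $93\epsilon^{1/9}q^4 \le (4\epsilon^{1/36})^4q^4$.
Thus, by Lemma 5.2, $(X,Y)_G$ satisfies DISC$(q,q,4\epsilon^{1/36})$. □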
5.2 $K_{1,2,2}$ implies inheritance

We now prove Propositions 1.13 and 5.1 using Lemma 5.3.

Proof of Proposition 1.13. First we show that only a small fraction of vertices in Z have very few
neighbors in X and Y. Let $Z_1$ be the set of all vertices in Z with fewer than $(1-\xi)q_{XZ}|X|$
neighbors in X. Applying discrepancy to $(X,Z_1)$ yields $\xi q_{XZ}|Z_1| \le \epsilon p\,|Z|$. If we assume that
$\epsilon \le \frac{1}{3}\alpha\xi^2$, we have $|Z_1| \le \frac{\epsilon}{\alpha\xi}|Z| \le \frac{\xi}{3}|Z|$. Similarly let $Z_2$ be the set of all vertices in Z with
fewer than $(1-\xi)q_{YZ}|Y|$ neighbors in Y, so that $|Z_2| \le \frac{\xi}{3}|Z|$ as well.

Define $f : V(G) \times V(G) \to \mathbb{R}$ to be a function which agrees with G on pairs (X,Z) and (Y,Z),
and agrees with $G - q_{XY}$ on (X,Y). Let us assign each vertex of $K_{1,2,2}$ to one of {X,Y,Z} as
follows (two vertices are assigned to each of X and Y).

[Figure: $K_{1,2,2}$ with one vertex assigned to Z and two vertices assigned to each of X and Y.]

The stated jumbledness hypotheses suffice for counting $K_{1,2,2}$ and its subgraphs; we refer to the
tutorial in Section 4.5 for an explanation.

By expanding all the $(G(x,y) - q_{XY})$ factors and using our counting lemma, we get

$$f(K_{1,2,2}) = \sum_{S \subseteq E(K_{2,2})} (-q_{XY})^{4-|S|}\,G\!\left(K_{1,2,2}^{S}\right) \approx_{c,\epsilon}^{p^8} 0,$$

where $K_{2,2}$ is the 4-cycle of $K_{1,2,2}$ between X and Y, and $K_{1,2,2}^{S}$ denotes the subgraph of $K_{1,2,2}$ that keeps all edges to Z and only those X-Y edges lying in S.
Therefore, by choosing $\epsilon$ and c to be sufficiently small (but polynomial in $\xi, \alpha, \epsilon'$), we can guarantee
that

$$f(K_{1,2,2}) \le \frac{\xi}{3}\,\epsilon'^4(1-\xi)^4\alpha^4p^8.$$

Let $K_{2,2}$ denote the subgraph of the above $K_{1,2,2}$ that gets mapped between X and Y. For each
$z \in Z$, let $f_z : X \times Y \to \mathbb{R}$ be defined by $f_z(x,y) = (G(x,y) - q_{XY})\,1_{N_X(z)}(x)\,1_{N_Y(z)}(y)$. We have

$$f(K_{1,2,2}) = \int_{z \in Z} f_z(K_{2,2})\, dz.$$

By Lemma 5.5, $f_z(K_{2,2}) \ge 0$ for all $z \in Z$. Let $Z_3$ be the set of vertices z in Z such that
$f_z(K_{2,2}) > \epsilon'^4(1-\xi)^4\alpha^4p^8$. Then $|Z_3| \le \frac{\xi}{3}|Z|$.

Let $Z' = Z \setminus (Z_1 \cup Z_2 \cup Z_3)$. So $|Z'| \ge (1-\xi)|Z|$. Furthermore, for any $z \in Z'$,

$$f_z(K_{2,2}) \le \epsilon'^4(1-\xi)^4\alpha^4p^8 \le \epsilon'^4(1-\xi)^4q_{XZ}^2q_{YZ}^2p^4 \le \epsilon'^4p^4\left(\frac{|N_X(z)|}{|X|}\right)^2\left(\frac{|N_Y(z)|}{|Y|}\right)^2.$$

It follows by Lemma 5.3 that $(N_X(z),N_Y(z))_G$ satisfies DISC$(q_{XY},p,\epsilon')$. □



Proof of Proposition 5.1. The proof is essentially the same as the proof of Proposition 1.13, with
the difference being that we now use the following graph. We omit the details.



6 One-sided counting 

We are now in a position to prove Theorem 1.14, which we now recall.

Theorem 1.14. For every fixed graph H on vertex set $\{1, 2, \ldots, m\}$ and every $\alpha, \theta > 0$, there exist
constants $c > 0$ and $\epsilon > 0$ such that the following holds.

Let $\Gamma$ be a graph with vertex subsets $X_1, \ldots, X_m$ and suppose that the bipartite graph $(X_i,X_j)_\Gamma$
is $(p, cp^{d_2(H)+3}\sqrt{|X_i||X_j|})$-jumbled for every $i < j$ with $ij \in E(H)$. Let G be a subgraph of $\Gamma$,
with the vertex i of H assigned to the vertex subset $X_i$ of G. For each edge ij of H, assume that
$(X_i,X_j)_G$ satisfies DISC$(q_{ij},p,\epsilon)$, where $\alpha p \le q_{ij} \le p$. Then $G(H) \ge (1-\theta)q(H)$.

The idea is to embed vertices of H one at a time. At each step, the set of potential targets for 
each unembedded vertex shrinks, but we can choose our embedding so that it doesn't shrink too 
much and discrepancy is inherited. 
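The bookkeeping behind this strategy, shrinking the set of potential targets for each unembedded vertex as earlier vertices are placed, can be sketched as a brute-force counter. This is an illustration only: the function name, the 0-based vertex labels and the adjacency-dictionary input are assumptions, and the code counts all partite embeddings exactly rather than selecting well-behaved ones as the proof does.

```python
def count_partite_embeddings(m, h_edges, parts, adj):
    """Count maps f with f(i) in parts[i] such that f(a) f(b) is an edge of G
    for every edge ab of H. Vertices of H are 0..m-1, embedded in increasing order."""
    h_nbrs = {i: set() for i in range(m)}
    for a, b in h_edges:
        h_nbrs[a].add(b)
        h_nbrs[b].add(a)

    def extend(j, targets):
        # targets[i] is the current candidate set for each unembedded vertex i >= j.
        if j == m:
            return 1
        total = 0
        for w in targets[j]:
            new_targets = dict(targets)
            for i in range(j + 1, m):
                if i in h_nbrs[j]:
                    # Later neighbors of vertex j must land in the neighborhood of w.
                    new_targets[i] = targets[i] & adj[w]
            total += extend(j + 1, new_targets)
        return total

    return extend(0, {i: set(parts[i]) for i in range(m)})


# Triangles in the complete tripartite graph with parts {0,1}, {2,3}, {4,5}.
parts = [{0, 1}, {2, 3}, {4, 5}]
adj = {v: {w for w in range(6) if w // 2 != v // 2} for v in range(6)}
print(count_partite_embeddings(3, [(0, 1), (0, 2), (1, 2)], parts, adj))
```

In the proof the same candidate sets appear as $T(i,j)$; the additional work there is to choose each image so that the candidate sets stay large and keep satisfying discrepancy.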

Proof. Suppose that vi,V2^ ■ ■ ■ ,Vm''^s an ordering of the vertices of H which yields the 2-degeneracy 
d2{H) and that the vertex Vi is to be embedded in X^. Let L{j) = {fi,f2, • • • iVj}. For i > j, let 
N{i,j) = N{vi) n L(j) be the set of neighbors Vh of Vi with h < j. Let q{j) = Y\qab^ where the 
product is taken over all edges VaVb of H with I < a < b < j and q{i,j) = Y\vh&N{ij) Ihi- Note that 

q{j) = qUJ - - !)• 

We need to define several constants. To begin, we let 6m = ^ and em = 1- Given 6j and 
ej, we define = and 6j-i = We apply Propositions 11.13] and [5TT] with a,^j and Cj 
to find constants Cj_i and e^_^ such that the conclusions of the two propositions hold. We let 

ej-i = min(e*_p ^), c = \a'^'^^^'> cq and e = eo. 

We will find many embeddings $f : V(H) \to V(G)$ by embedding the vertices of $H$ one by one
in increasing order. We will prove by induction on $j$ that there are at least $(1 - \theta_j) q(j) |X_1||X_2| \cdots |X_j|$
choices for $f(v_1), f(v_2), \dots, f(v_j)$ such that the following conditions hold. Here, for each $i > j$, we
let $T(i,j)$ be the set of vertices in $X_i$ which are adjacent to $f(v_h)$ for every $v_h \in N(i,j)$. That is,
it is the set of possible vertices into which, having embedded $v_1, v_2, \dots, v_j$, we may embed $v_i$.

• For $1 \le a < b \le j$, $(f(v_a), f(v_b))$ is an edge of $G$ if $(v_a, v_b)$ is an edge of $H$;

• For each $i_1, i_2 > j$ with $v_{i_1} v_{i_2}$ an edge of $H$, the graph $(T(i_1, j), T(i_2, j))_G$ satisfies the
discrepancy condition $\mathrm{DISC}(q_{i_1 i_2}, p, \epsilon_j)$;







• \T{iJ)\ > (1 - -^)q{i,j)\Xi\ for every i > j; 
The base case $j = 0$ clearly holds by letting $T(i, 0) = X_i$. We may therefore assume that
there are $(1 - \theta_{j-1}) q(j-1) |X_1||X_2| \cdots |X_{j-1}|$ embeddings of $v_1, \dots, v_{j-1}$ satisfying the conditions
above. Let us fix such an embedding $f$. Our aim is to find a set $W(j) \subseteq T(j, j-1)$ with
$|W(j)| \ge (1 - \frac{\theta_j}{2}) q(j, j-1) |X_j|$ such that for every $w \in W(j)$ the following three conditions hold.

1. For each i > j with Vi G N{vj), there are at least (1 — -^)q{i,j)\Xi\ vertices in T{i,j — 1) 
which are adjacent to w\ 

2. For each $i_1, i_2 > j$ with $v_{i_1} v_{i_2}$, $v_{i_1} v_j$ and $v_{i_2} v_j$ edges of $H$, the induced subgraph of $G$ between
$N(w) \cap T(i_1, j-1)$ and $N(w) \cap T(i_2, j-1)$ satisfies the discrepancy condition $\mathrm{DISC}(q_{i_1 i_2}, p, \epsilon_j)$;

3. For each $i_1, i_2 > j$ with $v_{i_1} v_{i_2}$ and $v_{i_1} v_j$ edges of $H$ and $v_{i_2} v_j$ not an edge of $H$, the induced
subgraph of $G$ between $N(w) \cap T(i_1, j-1)$ and $T(i_2, j-1)$ satisfies the discrepancy condition
$\mathrm{DISC}(q_{i_1 i_2}, p, \epsilon_j)$.

Note that once we have found such a set, we may take $f(v_j) = w$ for any $w \in W(j)$. By using
the induction hypothesis to count the number of embeddings of the first $j - 1$ vertices, we see that
there are at least

$(1 - \frac{\theta_j}{2})\, q(j, j-1) |X_j| \,(1 - \theta_{j-1})\, q(j-1) |X_1||X_2| \cdots |X_{j-1}| \ge (1 - \theta_j)\, q(j) |X_1||X_2| \cdots |X_j|$

ways of embedding $v_1, v_2, \dots, v_j$ satisfying the necessary conditions. Here we used that
$q(j) = q(j, j-1) q(j-1)$ and $\theta_{j-1} = \frac{\theta_j}{2}$. The induction therefore follows by letting $T(i,j) = N(w) \cap T(i, j-1)$
for all $i > j$ with $v_i \in N(v_j)$ and $T(i,j) = T(i, j-1)$ otherwise.

It remains to show that there is a large subset W{j) of T{j,j — 1) satisfying the required 
conditions. For each i > j, let Ai{j) be the set of vertices in T{j,j — 1) for which \N{w) f] T{i,j — 

1)1 ^ (1 — '\^)Qij\T{i, j — 1)1- Then, since the graph between T(i,j — 1) and T{j,j — 1) satisfies 

- 

BlSC{qji,p,ej-i), we have that ej-ip\T{j,j - 1)| > ^qij\Ai{j)\. Hence, since qij > ap, 

\AU)\<^^\T{j,j-l)\. 
Note that for any w eT{j,j - 1)\ Ai{j), 

\N{w) n T{i,j - 1)1 > (l - ^) qv\T{iJ - 1)1 ^ (l - ^) «(^' 

For each $i_1, i_2 > j$ with $v_{i_1} v_{i_2}$, $v_{i_1} v_j$ and $v_{i_2} v_j$ edges of $H$, let $B_{i_1,i_2}(j)$ be the set of vertices $w$
in $T(j, j-1)$ for which the graph between $N(w) \cap T(i_1, j-1)$ and $N(w) \cap T(i_2, j-1)$ does not
satisfy $\mathrm{DISC}(q_{i_1 i_2}, p, \epsilon_j)$. Note that



\T{h,j - l)\\T{i2,j - 1)1 > ( 1 - ) q{ii,j - l)q{i2,j - l)\Xi,\\Xi,\ 



^2d2{H)-2 

where we get 2(i2(^) — 2 because J is a neighbor of both ?! andi2withj < 11^12- Similarly, |T(ii,j — 
l)\\T{j,j - 1)1 and |T(Z2, j - l)||T(j, j - 1)| are at least «!^p2d,(//)|^^j^ 
respectively. 
Since



< copV|r(ii,i-l)||r(i2,i-l)|, 

the induced subgraphofF between T(ii,j—1) andT(i2, j— 1) is {p,cop^y/\T{ii,j — l)||T(i2,J — 1)|)- 
jumbled. Similarly, the induced subgraph of F between the sets T{j,j — 1) and T{ii,j — 1) 
is {p,coP^\/\T{j,j — l)\\T{ii,j — l)|)-jumbled and the induced subgraph between T{j,j — 1) and 
T{i2,j — 1) is {p,cop^y^\T{j,j — l)||T(i2,i — l)|)-jumbled. By our choice of ej_i, we may therefore 
apply Proposition 1.13 to show that $|B_{i_1,i_2}(j)| \le \xi_j |T(j, j-1)|$.

For each $i_1, i_2 > j$ with $v_{i_1} v_{i_2}$ and $v_{i_1} v_j$ edges of $H$ and $v_{i_2} v_j$ not an edge of $H$, let $C_{i_1,i_2}(j)$ be
the set of vertices $w$ in $T(j, j-1)$ for which the graph between $N(w) \cap T(i_1, j-1)$ and $T(i_2, j-1)$
does not satisfy $\mathrm{DISC}(q_{i_1 i_2}, p, \epsilon_j)$. As with $B_{i_1,i_2}(j)$, we may apply Proposition 5.1 to conclude that

$|C_{i_1,i_2}(j)| \le \xi_j |T(j, j-1)|.$

Counting over all possible bad events and using that |T(j, j — 1)| > (1 — ^-^)Q{j-,3 — we 
see that the set W{j) of good vertices has size at least (1 — cr)q{j,j — l)\Xj\, where 

- 6 aOj V 2 y 12 6 6 - 2 ' 

as required. Here we used 6j = ej-i < and = This completes the proof. □ 

Note that for the clique $K_t$, we have $d_2(K_t) + 3 = t + 1$. In this case, it is better to use the
bound coming from two-sided counting, which gives the exponent $t$.

Another case of interest is when the graph $H$ is triangle-free. Here it is sufficient to always
apply the simpler inheritance theorem, Proposition 5.1, to maintain discrepancy. Then, since

we see that an exponent of $d_2(H) + 2$ is sufficient in this case. In particular, for $H = K_{s,t}$, we get
an exponent of $d_2(K_{s,t}) + 2 = \frac{s-1}{2} + 2 = \frac{s+3}{2}$, as quoted in Table 1.

It is also worth noting that a one-sided counting lemma for F holds under the slightly weaker 
assumption that (3 < cp'^^^^^^^n. We omit the details since the proof is a simpler version of the 
previous one, without the necessity for tracking inheritance of discrepancy. 

Proposition 6.1. For every fixed graph $H$ on vertex set $\{1, 2, \dots, m\}$ and every $\theta > 0$, there
exist constants $c > 0$ and $\epsilon > 0$ such that the following holds.

Let $\Gamma$ be a graph with vertex subsets $X_1, \dots, X_m$, where vertex $i$ of $H$ is assigned to the vertex
subset Xi of T and suppose that the bipartite graph (Xj,Xj)r is {p, cp'^^^^'^'^^ y^\Xi\ \Xj\) -jumbled 
for every $i < j$ with $ij \in E(H)$. Then $\Gamma(H) \ge (1 - \theta) p(H)$.

7 Counting cycles 

Using the tools of doubling and densification, we already know how to count all cycles. For cycles
of length 4 or greater, $(p, cp^2 n)$-jumbledness suffices.

Proposition 7.1. Assume Setup [7^7] with H = Ce and k>3if£ = 3ork>2if£>4. Then 
G(QXf^)g(Q). 
Proof. When $\ell = 4$, see Proposition 4.13. When $\ell = 3$, see Section 2 for the doubling procedure.
For $\ell \ge 5$, we can perform densification to reduce the problem to counting $C_{\ell-1}$ with at least one
dense edge, so we proceed by induction. □




The goal of this section is to prove a one-sided counting lemma for cycles that requires much 
weaker jumbledness. 

Proposition 7.2. Assume Setup with $H = C_\ell$, where $\ell \ge 5$, and all edges sparse. Let
$k \ge 1 + \frac{1}{\ell - 3}$ if $\ell$ is odd and $k \ge 1 + \frac{1}{\ell - 4}$ if $\ell$ is even. Then $G(C_\ell) \ge q(C_\ell) - \theta p(C_\ell)$ with
$\theta \le 100(\epsilon^{1/(2\ell)} + \ell c^{2/3})$.

The strategy is via subdivision densification, as outlined in Section 2.



7.1 Subdivision densification 

In Section 4.2, we showed how to reduce a counting problem by transforming a singly subdivided
edge of H into a dense edge. In this section, we show how to transform a multiply subdivided 
edge of H into a dense edge, using much weaker hypotheses on jumbledness, at least for one-sided 
counting. The idea is that a long subdivision allows more room for mixing, and thus requires less 
jumbledness at each step. 




We introduce a weaker variant of discrepancy for one-sided counting. 

Definition 7.3. Let $G$ be a graph with vertex subsets $X$ and $Y$. We say that $(X, Y)_G$ satisfies
$\mathrm{DISC}_{\ge}(q, p, \epsilon)$ if

$\int_{x \in X,\, y \in Y} (G(x,y) - q)\, u(x) v(y)\, dx\, dy \ge -\epsilon p$    (10)

for all functions $u : X \to [0, 1]$ and all $v : Y \to [0, 1]$.

In a graph $H$, we say that $a_0 a_1 a_2 \cdots a_m$ is a subdivided edge if the neighborhood of $a_i$ in $H$ is
$\{a_{i-1}, a_{i+1}\}$ for $1 \le i \le m - 1$. Say that it is sparse if every edge $a_i a_{i+1}$, $0 \le i \le m - 1$, is sparse.
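For very small weighted bipartite graphs, the one-sided discrepancy condition (10) can be checked by brute force. The sketch below is only an illustration under our own conventions (the names `min_discrepancy` and `satisfies_disc_lower` are hypothetical). It uses the fact that the left-hand side of (10) is linear in each of $u$ and $v$ separately, so its minimum over $[0,1]$-valued functions is attained at $\{0,1\}$-valued ones.

```python
from itertools import product


def min_discrepancy(G, q):
    """Minimum of the normalised sum of (G[x][y] - q) u(x) v(y) over 0/1-valued u, v.

    G is a list of rows: G[x][y] is the weight between x in X and y in Y.
    """
    nx, ny = len(G), len(G[0])
    best = 0.0  # u identically zero gives the value 0
    for u in product((0, 1), repeat=nx):
        # For fixed u, the minimising v selects exactly the columns whose
        # weighted column sum is negative.
        col = [sum((G[x][y] - q) * u[x] for x in range(nx)) for y in range(ny)]
        best = min(best, sum(c for c in col if c < 0) / (nx * ny))
    return best


def satisfies_disc_lower(G, q, p, eps):
    """Check the one-sided condition: the minimum above is at least -eps * p."""
    return min_discrepancy(G, q) >= -eps * p
```

The enumeration is exponential in $|X|$, so this is only meant for sanity checks on toy examples.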

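For a graph $\Gamma$ or $G$ with vertex subsets $X_0, X_1, \dots, X_m$, $x_0 \in X_0$, $x_m \in X_m$ and $X'_i \subseteq X_i$, we
write

$G(x_0, X'_1, X'_2, \dots, X'_m) = \int_{x_1 \in X_1, \dots, x_m \in X_m} G(x_0, x_1) 1_{X'_1}(x_1) \cdots G(x_{m-1}, x_m) 1_{X'_m}(x_m)\, dx_1 dx_2 \cdots dx_m,$

$G(x_0, X'_1, X'_2, \dots, X'_{m-1}, x_m) = \int_{x_1 \in X_1, \dots, x_{m-1} \in X_{m-1}} G(x_0, x_1) 1_{X'_1}(x_1) \cdots G(x_{m-1}, x_m)\, dx_1 dx_2 \cdots dx_{m-1}.$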

These quantities can be interpreted probabilistically. The first expression is the probability that 
a randomly chosen sequence of vertices with one endpoint fixed is a path in G with the vertices 
landing in the chosen subsets. For the second expression, both endpoints are fixed. 
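As a concrete illustration of the second quantity with $X'_i = X_i$, the normalised path weight between two fixed endpoints can be computed layer by layer. This is a minimal sketch; the function name `path_density` and the nested-list representation are our own assumptions.

```python
def path_density(layers, x0, xm):
    """Normalised weight of paths from x0 in X_0 to xm in X_m.

    layers[i][a][b] is the weight G(a, b) between vertex a of X_i and
    vertex b of X_{i+1}; vertices of each X_i are indexed from 0.
    """
    # weights[b] holds G(x0, X_1, ..., X_{i-1}, b) for b in X_i.
    weights = list(layers[0][x0])
    for layer in layers[1:]:
        size = len(weights)  # size of the intermediate set being averaged over
        weights = [
            sum(weights[a] * layer[a][b] for a in range(size)) / size
            for b in range(len(layer[0]))
        ]
    return weights[xm]
```

Each pass averages over one intermediate vertex subset, which matches the normalisation of the integrals above.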
Lemma 7.4 (Subdivision densification). Assume Setup. Let $\ell \ge 2$ and let $a_0 a_1 \cdots a_\ell$ be a
sparse subdivided edge and assume that $a_0 a_\ell$ is not an edge of $H$. Assume $k \ge 1 + \frac{1}{2\ell - 2}$. Replace
the induced bipartite graph $(X_{a_0}, X_{a_\ell})$ by the weighted bipartite graph given by

$G(x_0, x_\ell) = \frac{1}{4p^\ell} \min\{G(x_0, X_1, X_2, \dots, X_{\ell-1}, x_\ell),\ 4p^\ell\}$

for $(x_0, x_\ell) \in X_0 \times X_\ell$. Let $H'$ be $H$ with the path $a_0 a_1 \cdots a_\ell$ deleted and the edge $a_0 a_\ell$ added.
Let $p_{a_0 a_\ell} = 1$ and $q_{a_0 a_\ell} = \frac{1}{4p^\ell} q_{a_0 a_1} q_{a_1 a_2} \cdots q_{a_{\ell-1} a_\ell}$. Then $G(H) \ge 4p^\ell G(H')$ and $(X_0, X_\ell)_G$ satisfies
$\mathrm{DISC}_{\ge}(q_{a_0 a_\ell}, 1, 18(\epsilon^{1/(2\ell)} + \ell c^{2/3}))$.

Remark. If there is at least one dense edge in the subdivision, then using arguments similar to the
ones in Section 4.2, modified for one-sided counting, we can show that $k \ge 1$ suffices for subdivision
densification.

The idea of the proof is very similar to densification in Section 4.2. The claim $G(H) \ge 4p^\ell G(H')$
follows easily from the new edge weights. It remains to show that $(X_0, X_\ell)_G$ satisfies $\mathrm{DISC}_{\ge}$. So
Lemma 7.4 follows from the next result.

Lemma 7.5. Let $m \ge 2$, $c, \epsilon, p \in (0, 1]$, and $q_1, q_2, \dots, q_m \in [0, p]$. Let $\Gamma$ be any weighted graph with
vertex subsets $X_0, X_1, \dots, X_m$ and let $G$ be a subgraph of $\Gamma$. Suppose that, for each $i = 1, \dots, m$,
$(X_{i-1}, X_i)_\Gamma$ is $(p, c p^{1 + \frac{1}{2m-2}} \sqrt{|X_{i-1}||X_i|})$-jumbled and $(X_{i-1}, X_i)_G$ satisfies $\mathrm{DISC}_{\ge}(q_i, p, \epsilon)$. Then
the weighted graph $G'$ on $(X_0, X_m)$ defined by

$G'(x_0, x_m) = \min\{G(x_0, X_1, X_2, \dots, X_{m-1}, x_m),\ 4p^m\}$

satisfies $\mathrm{DISC}_{\ge}(q_1 q_2 \cdots q_m, p^m, 72(\epsilon^{1/(2m)} + m c^{2/3}))$.

Here are the steps for the proof of Lemma 7.5.

1. Show that the graph on $X_0 \times X_m$ with weights $G(x_0, X_1, X_2, \dots, X_{m-1}, x_m)$ satisfies $\mathrm{DISC}_{\ge}$.

2. Under the assumption that every vertex in $X_i$ has roughly the same number of neighbors in
$X_{i+1}$ for every $i$, show that capping of the edge weights has negligible effect on discrepancy.

3. Show that we can delete a small subset from each vertex subset $X_i$ so that the assumption
in step 2 is satisfied.

Step 2 is the most difficult. Since we are only proving lower bound discrepancy, it is okay to delete
vertices in step 3. This is also the reason why this proof, without significant modification, cannot
prove two-sided discrepancy (which may require stronger hypotheses), as we may have deleted too
many edges in the process. Also, unlike the densification in Section 4.2, we do not have to worry
about the effect of the edge weight capping on the overall $H$-count, as we are content with a lower
bound.

The next two lemmas form step 1 of the program. 

Lemma 7.6. Let $G$ be a weighted graph with vertex subsets $X, Y, Z$. Let $p_1, p_2, \epsilon \in (0, 1]$ and
$q_1 \in [0, p_1]$, $q_2 \in [0, p_2]$. If $(X, Y)_G$ satisfies $\mathrm{DISC}_{\ge}(q_1, p_1, \epsilon)$ and $(Y, Z)_G$ satisfies $\mathrm{DISC}_{\ge}(q_2, p_2, \epsilon)$,
then the induced weighted bipartite graph $G'$ on $(X, Z)$ whose weight is given by

$G'(x, z) = G(x, Y, z)$

satisfies $\mathrm{DISC}_{\ge}(q_1 q_2, p_1 p_2, 6\sqrt{\epsilon})$.
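Note that no jumbledness hypothesis is needed for the lemma.

Proof. Let $u : X \to [0, 1]$ and $w : Z \to [0, 1]$ be arbitrary functions. Let

$Y' = \{ y \in Y : \int_{x \in X} (G(x,y) - q_1) u(x)\, dx < -\sqrt{\epsilon}\, p_1 \}.$

Then applying (10) to $u$ and $1_{Y'}$ yields $|Y'| \le \sqrt{\epsilon}\, |Y|$. Similarly, let

$Y'' = \{ y \in Y : \int_{z \in Z} (G(y,z) - q_2) w(z)\, dz < -\sqrt{\epsilon}\, p_2 \}.$

Then $|Y''| \le \sqrt{\epsilon}\, |Y|$ as well. So

$\int_{x \in X,\, z \in Z} G'(x,z) u(x) w(z)\, dx\, dz = \int_{x \in X,\, y \in Y,\, z \in Z} u(x) G(x,y) G(y,z) w(z)\, dx\, dy\, dz$

$\ge \int_{y \in Y \setminus (Y' \cup Y'')} \left( \int_{x \in X} G(x,y) u(x)\, dx \right) \left( \int_{z \in Z} G(y,z) w(z)\, dz \right) dy$

$\ge \int_{y \in Y \setminus (Y' \cup Y'')} (q_1 \mathbb{E} u - \sqrt{\epsilon}\, p_1)(q_2 \mathbb{E} w - \sqrt{\epsilon}\, p_2)\, dy$

$\ge (1 - 2\sqrt{\epsilon})(q_1 \mathbb{E} u - \sqrt{\epsilon}\, p_1)(q_2 \mathbb{E} w - \sqrt{\epsilon}\, p_2)$

$\ge q_1 q_2\, \mathbb{E} u\, \mathbb{E} w - 6\sqrt{\epsilon}\, p_1 p_2.$

□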

The above proof can be extended to prove a one-sided counting lemma for trees without any 
jumbledness hypotheses. We omit the details. 

Proposition 7.7. Let $H$ be a tree on vertices $\{1, 2, \dots, m\}$. For every $\theta > 0$, there exists $\epsilon > 0$ of
size at least polynomial in $\theta$ such that the following holds.

Let $G$ be a weighted graph with vertex subsets $X_1, \dots, X_m$. For each edge $ab$ of $H$, suppose that
$(X_a, X_b)_G$ satisfies $\mathrm{DISC}_{\ge}(q_{ab}, p_{ab}, \epsilon)$ for some $0 \le q_{ab} \le p_{ab} \le 1$. Then $G(H) \ge q(H) - \theta p(H)$.

By Lemma 7.6 and induction, we obtain the following lemma about counting paths in $G$.

Lemma 7.8. Let $G$ be a weighted graph with vertex subsets $X_0, X_1, \dots, X_m$. Let $0 < \epsilon < 1$.
Suppose that for each $i = 1, 2, \dots, m$, $(X_{i-1}, X_i)_G$ satisfies $\mathrm{DISC}_{\ge}(q_i, p_i, \epsilon)$ for some numbers
$0 \le q_i \le p_i \le 1$. Then the induced weighted bipartite graph $G'$ on $X_0 \times X_m$ whose edge weights are
given by

$G'(x_0, x_m) = G(x_0, X_1, X_2, \dots, X_{m-1}, x_m)$

satisfies $\mathrm{DISC}_{\ge}(q_1 q_2 \cdots q_m, p_1 p_2 \cdots p_m, 36\epsilon^{1/(2m)})$.

Proof. Applying Lemma 7.6, we see that the auxiliary weighted graphs on $(X_0, X_2), (X_2, X_4), \dots$
satisfy $\mathrm{DISC}_{\ge}(q_1 q_2, p_1 p_2, 36\epsilon^{1/2})$, etc. Applying Lemma 7.6 again, we find that the auxiliary
weighted graphs on $(X_0, X_4), (X_4, X_8), \dots$ satisfy $\mathrm{DISC}_{\ge}(q_1 q_2 q_3 q_4, p_1 p_2 p_3 p_4, 36\epsilon^{1/4})$, etc. Continuing,
we find that $(X_0, X_m)_{G'}$ satisfies $\mathrm{DISC}_{\ge}(q_1 q_2 \cdots q_m, p_1 p_2 \cdots p_m, \epsilon')$ with $\epsilon' = 36\epsilon^{2^{-(\log_2 m + 1)}} = 36\epsilon^{1/(2m)}$. □
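The bookkeeping of the error parameter in this doubling argument can be checked numerically. The sketch below is an illustration only: it iterates the map $\epsilon \mapsto 6\sqrt{\epsilon}$ from Lemma 7.6 once per doubling round and compares the result with the bound $36\epsilon^{1/(2m)}$. The function name `iterate_error` is ours, and the number of rounds $\lceil \log_2 m \rceil$ is our reading of "continuing".

```python
import math


def iterate_error(eps, m):
    """Error parameter after composing paths of length m by repeated doubling.

    Each round applies eps -> 6 * sqrt(eps), as in Lemma 7.6; ceil(log2(m))
    rounds suffice to reach paths of length m.
    """
    rounds = (m - 1).bit_length()  # equals ceil(log2(m)) for m >= 1
    for _ in range(rounds):
        eps = 6 * math.sqrt(eps)
    return eps
```

After $k$ rounds the value is $6^{2 - 2^{1-k}} \epsilon^{2^{-k}} \le 36 \epsilon^{2^{-k}}$, and $2^{-k} \ge 1/(2m)$, which is why the stated bound holds for $0 < \epsilon < 1$.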

For step 2 of the proof, we need to assume some degree-regularity between the parts. We note 
that the order of X and Y is important in the following definition. 
Definition 7.9. Let $\Gamma$ be a weighted graph with vertex subsets $X$ and $Y$. We say that $(X, Y)_\Gamma$ is
$(p, \xi, \eta)$-bounded if $|\Gamma(x, Y) - p| \le \xi p$ for all $x \in X$ and $\Gamma(x, y) \le \eta$ for all $x \in X$ and $y \in Y$.

Here is the idea of the proof. Fix a vertex $x_0 \in X_0$, and consider its successive neighborhoods
in $X_1, X_2, \dots$. Let us keep track of the number of paths from $x_0$ to each endpoint. We expect the
number of paths to be somewhat evenly distributed among vertices in the successive neighborhoods
and, therefore, we do not expect many vertices in $X_i$ to have disproportionately many paths to $x_0$.
In particular, capping the weights of $\Gamma(x_0, X_1, \dots, X_{m-1}, x_m)$ has a negligible effect.

Here is a back-of-the-envelope calculation. Suppose every pair $(X_i, X_{i+1})_\Gamma$ is $(p, \gamma\sqrt{|X_i||X_{i+1}|})$-
jumbled. First we remove a small fraction of vertices from each vertex subset $X_i$ so that the remaining
graph $\Gamma$ is bounded, i.e., every vertex has roughly the expected number of neighbors in the next
vertex subset. Let $S \subseteq X_i$, and let $N(S)$ be its neighborhood in $X_{i+1}$. Then the number of edges
$e(S, N(S))$ between $S$ and $N(S)$ is roughly $p|S||X_{i+1}|$ by the degree assumptions on $X_i$. On the
other hand, by jumbledness, $e(S, N(S)) \le \gamma\sqrt{|X_i||X_{i+1}||S||N(S)|} + p|S||N(S)|$. When $S$ is
small, the first term dominates, and by comparing the two estimates we get that $\frac{|N(S)|}{|X_{i+1}|}$ is at least
roughly $p^2\gamma^{-2}\frac{|S|}{|X_i|}$. Now fix a vertex $x_0 \in X_0$. It has about $p|X_1|$ neighbors in $X_1$. At each step,
the fraction of $X_i$ occupied by the successive neighborhood of $x_0$ expands by a factor of about
$p^2\gamma^{-2}$, until the successive neighborhood saturates some $X_i$. Note that for $\gamma = cp^{1 + \frac{1}{2m-2}}$, we have
$p(p^2\gamma^{-2})^{m-1} \gg 1$, so the successive neighborhood of $x_0$ in $X_m$ is essentially all of $X_m$. So we can
expect the resulting weighted graph to be dense.
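The choice of exponent in the jumbledness parameter can be sanity-checked numerically. With $\gamma = cp^{1 + 1/(2m-2)}$, the expansion factor satisfies $p^2\gamma^{-2} = c^{-2}p^{-1/(m-1)}$, so the saturation quantity $p(p^2\gamma^{-2})^{m-1}$ equals $c^{-2(m-1)}$ independently of $p$. The helper name `saturation_ratio` below is our own.

```python
def saturation_ratio(p, c, m):
    """Return p * (p^2 / gamma^2)^(m-1) for gamma = c * p^(1 + 1/(2m-2))."""
    gamma = c * p ** (1 + 1 / (2 * m - 2))
    expansion = p ** 2 / gamma ** 2  # growth factor of the neighbourhood fraction per step
    return p * expansion ** (m - 1)
```

For instance, with $c = 0.1$ and $m = 4$ the ratio is about $10^6$ for every $p \in (0,1]$, up to floating-point error, so small $c$ leaves ample room for the heuristic.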

We will use induction. We show that from a fixed $x_0 \in X_0$, if we can bound the number of
paths to each vertex in $X_i$, then we can do so for $X_{i+1}$ as well.

The next result is the key technical lemma. It is an induction step for the lemma that follows.
One should think of $X$, $Y$ and $Z$ as $X_0$, $X_i$ and $X_{i+1}$, respectively.

Lemma 7.10. Let $p_1, p_2, \xi_1, \xi_2, \xi_3 \in (0, 1]$, and $\eta_1, \gamma_2 > 0$. Let $\Gamma$ be a weighted graph with vertex
subsets $X, Y, Z$. Assume that $(X, Y)_\Gamma$ is $(p_1, \xi_1, \eta_1)$-bounded and $(Y, Z)_\Gamma$ is $(p_2, \xi_2, 1)$-bounded and
$(p_2, \gamma_2\sqrt{|Y||Z|})$-jumbled. Let $\eta' = \max\{4\gamma_2^2 p_2^{-1} \xi_3^{-1} \eta_1,\ 4p_1 p_2\}$ and $\xi' = \xi_1 + \xi_2 + 2\xi_3$. Then the
weighted graph $\Gamma'$ on $(X, Z)$ given by

$\Gamma'(x, z) = \min\{\Gamma(x, Y, z),\ \eta'\}$

is $(p_1 p_2, \xi', \eta')$-bounded.

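Proof. We have $\Gamma'(x, z) \le \eta'$ for all $x \in X$, $z \in Z$. Also, by the boundedness assumptions, we
have $\Gamma'(x, Z) \le \Gamma(x, Y, Z) \le (1 + \xi_1)(1 + \xi_2) p_1 p_2 \le (1 + \xi') p_1 p_2$. It only remains to prove that
$\Gamma'(x, Z) \ge (1 - \xi') p_1 p_2$ for all $x \in X$.

Fix any $x \in X$. Let

$Z'_x = \{ z \in Z \mid \Gamma(x, Y, z) > \eta' \}.$

Note that $\Gamma'(x, Z) \ge \Gamma(x, Y, Z) - \Gamma(x, Y, Z'_x)$, so we would like to find an upper bound for $\Gamma(x, Y, Z'_x)$.

Apply the jumbledness criterion (3) to $(Y, Z)_\Gamma$ with the functions $u(y) = \Gamma(x, y)\eta_1^{-1}$ and $v(z) = 1_{Z'_x}(z)$.
Note that $0 \le u \le 1$ due to boundedness. We have

$\int_{y \in Y,\, z \in Z} \Gamma(x, y)\eta_1^{-1} (\Gamma(y, z) - p_2) 1_{Z'_x}(z)\, dy\, dz \le \gamma_2 \sqrt{\Gamma(x, Y)\eta_1^{-1} \frac{|Z'_x|}{|Z|}} \le \gamma_2 \sqrt{(1 + \xi_1) p_1 \eta_1^{-1} \frac{|Z'_x|}{|Z|}}.$

The integral equals $\eta_1^{-1}\left(\Gamma(x, Y, Z'_x) - p_2 \Gamma(x, Y) \frac{|Z'_x|}{|Z|}\right)$, so we have

$\Gamma(x, Y, Z'_x) - p_2 \Gamma(x, Y) \frac{|Z'_x|}{|Z|} \le \gamma_2 \sqrt{(1 + \xi_1) p_1 \eta_1 \frac{|Z'_x|}{|Z|}}.$    (11)

On the other hand, we have

$\Gamma(x, Y, Z'_x) - p_2 \Gamma(x, Y) \frac{|Z'_x|}{|Z|} \ge \eta' \frac{|Z'_x|}{|Z|} - (1 + \xi_1) p_1 p_2 \frac{|Z'_x|}{|Z|} \ge \frac{\eta'}{2} \frac{|Z'_x|}{|Z|}.$    (12)

Combining (11) with (12), we get

$\frac{|Z'_x|}{|Z|} \le \frac{4\gamma_2^2 (1 + \xi_1) p_1 \eta_1}{\eta'^2}.$    (13)

Substituting (13) back into (11), we have

$\Gamma(x, Y, Z'_x) \le (1 + \xi_1) p_1 p_2 \frac{|Z'_x|}{|Z|} + \gamma_2 \sqrt{(1 + \xi_1) p_1 \eta_1 \frac{|Z'_x|}{|Z|}}$

$\le \frac{4\gamma_2^2 (1 + \xi_1)^2 p_1^2 p_2 \eta_1}{\eta'^2} + \frac{2\gamma_2^2 (1 + \xi_1) p_1 \eta_1}{\eta'}$

$\le \frac{4\gamma_2^2 (1 + \xi_1)^2 p_1^2 p_2 \eta_1}{(4\gamma_2^2 p_2^{-1} \xi_3^{-1} \eta_1)(4 p_1 p_2)} + \frac{2\gamma_2^2 (1 + \xi_1) p_1 \eta_1}{4\gamma_2^2 p_2^{-1} \xi_3^{-1} \eta_1}$

$= \frac{1}{4}(1 + \xi_1)^2 \xi_3 p_1 p_2 + \frac{1}{2}(1 + \xi_1) \xi_3 p_1 p_2$

$\le 2\xi_3 p_1 p_2.$

Therefore,

$\Gamma'(x, Z) \ge \Gamma(x, Y, Z) - \Gamma(x, Y, Z'_x) \ge (1 - \xi_1)(1 - \xi_2) p_1 p_2 - 2\xi_3 p_1 p_2 \ge (1 - \xi') p_1 p_2.$

This completes the proof that $\Gamma'$ is $(p_1 p_2, \xi', \eta')$-bounded. □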

By repeated applications of Lemma 7.10, we obtain the following lemma for embedding paths
in $\Gamma$.

Lemma 7.11. Let $0 < 4c^2 < \xi < \frac{1}{4m}$ and $0 < p \le 1$. Let $\Gamma$ be a graph with vertex subsets
$X_0, X_1, \dots, X_m$. Suppose that, for each $i = 1, \dots, m$, $(X_{i-1}, X_i)_\Gamma$ is $(p, \xi, 1)$-bounded and
$(p, cp^{1 + \frac{1}{2m-2}} \sqrt{|X_{i-1}||X_i|})$-jumbled. Then the weighted bipartite graph $\Gamma'$ on $(X_0, X_m)$ defined by

$\Gamma'(x_0, x_m) = \min\{\Gamma(x_0, X_1, X_2, \dots, X_{m-1}, x_m),\ 4p^m\}$

is $(p^m, 4m\xi, 4p^m)$-bounded.

Proof. Since $\Gamma'(x_0, X_m) \le \Gamma(x_0, X_1, X_2, \dots, X_m) \le (1 + \xi)^m p^m \le e^{m\xi} p^m \le (1 + 4m\xi) p^m$ for all
$x_0 \in X_0$, it remains to show that $\Gamma'(x_0, X_m) \ge (1 - 4m\xi) p^m$ for all $x_0 \in X_0$.

For every $i = 1, \dots, m$, define a weighted graph $\Gamma^{(i)}$ on vertex sets $X_0, X_i, X_{i+1}$ (with $\Gamma^{(m)}$ only
defined on $X_0$ and $X_m$) as follows. Set $(X_i, X_{i+1})_{\Gamma^{(i)}} = (X_i, X_{i+1})_\Gamma$ for each $1 \le i \le m - 1$. Set
$(X_0, X_1)_{\Gamma^{(1)}} = (X_0, X_1)_\Gamma$ and

$\Gamma^{(i+1)}(x_0, x_{i+1}) = \min\{\Gamma^{(i)}(x_0, X_i, x_{i+1}),\ \eta_{i+1}\}$

for each $1 \le i \le m - 1$, where

$\eta_i = \max\{(4c^2\xi^{-1})^{i-1} p^{(1 + \frac{1}{m-1})(i-1)},\ 4p^i\}$

for every $i$. So $\Gamma^{(i)}(x_0, x_i) \le \Gamma(x_0, X_1, \dots, X_{i-1}, x_i)$ for every $i$ and every $x_0 \in X_0$, $x_i \in X_i$.
Let $\gamma = cp^{1 + \frac{1}{2m-2}}$. Note that $\eta_{i+1} = \max\{4\gamma^2 p^{-1} \xi^{-1} \eta_i,\ 4p^{i+1}\}$ for every $i$. So it follows by
Lemma 7.10 and induction that $(X_0, X_i)_{\Gamma^{(i)}}$ is $(p^i, 4i\xi, \eta_i)$-bounded for every $i$. Since $\eta_m = 4p^m$,
$\Gamma'(x_0, X_m) \ge \Gamma^{(m)}(x_0, X_m) \ge (1 - 4m\xi) p^m$, as desired. □
To complete step 2 of the proof, we show that the boundedness assumptions imply that the
edge weight capping has negligible effect on discrepancy.

Lemma 7.12. Let $0 < 4c^2 < \xi$ and $0 < p \le 1$. Let $\Gamma$ be a graph with vertex subsets $X_0, X_1, \dots, X_m$
and let $G$ be a subgraph of $\Gamma$. Suppose that, for each $i = 1, \dots, m$, $(X_{i-1}, X_i)_\Gamma$ is $(p, \xi, 1)$-
bounded and $(p, cp^{1 + \frac{1}{2m-2}} \sqrt{|X_{i-1}||X_i|})$-jumbled and $(X_{i-1}, X_i)_G$ satisfies $\mathrm{DISC}_{\ge}(q_i, p, \epsilon)$. Then
the weighted graph $G'$ on $(X_0, X_m)$ defined by

$G'(x_0, x_m) = \min\{G(x_0, X_1, X_2, \dots, X_{m-1}, x_m),\ 4p^m\}$

satisfies $\mathrm{DISC}_{\ge}(q_1 q_2 \cdots q_m, p^m, 36\epsilon^{1/(2m)} + 8\xi m)$.

Proof. We may assume that $\xi < \frac{1}{4m}$ since otherwise the claim is trivial as every graph satisfies
$\mathrm{DISC}_{\ge}(q, p, \epsilon)$ when $\epsilon \ge 1$. Let $\Gamma'$ be constructed as in Lemma 7.11. To simplify notation, let us
write

$G(x_0, x_m) = G(x_0, X_1, \dots, X_{m-1}, x_m)$

and $\Gamma(x_0, x_m) = \Gamma(x_0, X_1, \dots, X_{m-1}, x_m)$

for $x_0 \in X_0$, $x_m \in X_m$. We have

$G(x_0, x_m) - G'(x_0, x_m) = \max\{0,\ G(x_0, x_m) - 4p^m\}$

$\le \max\{0,\ \Gamma(x_0, x_m) - 4p^m\} = \Gamma(x_0, x_m) - \Gamma'(x_0, x_m).$

Let $q = q_1 q_2 \cdots q_m$. For any functions $u : X_0 \to [0, 1]$ and $v : X_m \to [0, 1]$, we have

$\int_{x_0 \in X_0,\, x_m \in X_m} (G'(x_0, x_m) - q) u(x_0) v(x_m)\, dx_0 dx_m \ge \int_{x_0 \in X_0,\, x_m \in X_m} (G(x_0, x_m) - q) u(x_0) v(x_m)\, dx_0 dx_m$

$\qquad - \int_{x_0 \in X_0,\, x_m \in X_m} (\Gamma(x_0, x_m) - \Gamma'(x_0, x_m)) u(x_0) v(x_m)\, dx_0 dx_m.$

The first term is at least $-36\epsilon^{1/(2m)} p^m$ by Lemma 7.8. For the second term, we use the boundedness
of $\Gamma$ and $\Gamma'$ to get

$\int_{x_0 \in X_0,\, x_m \in X_m} (\Gamma(x_0, x_m) - \Gamma'(x_0, x_m)) u(x_0) v(x_m)\, dx_0 dx_m \le \int_{x_0 \in X_0,\, x_m \in X_m} (\Gamma(x_0, x_m) - \Gamma'(x_0, x_m))\, dx_0 dx_m$

$\le (1 + 4\xi m) p^m - (1 - 4\xi m) p^m$

$\le 8\xi m p^m.$

It follows that $G'$ satisfies $\mathrm{DISC}_{\ge}(q, p^m, 36\epsilon^{1/(2m)} + 8\xi m)$. □

This completes step 2 of the program. Finally, we need to show that we have a large subgraph
of $\Gamma$ satisfying boundedness, so that we can apply Lemma 7.12 and then transfer the results back
to the original graph.

Lemma 7.13. Let $0 < \delta, \gamma, \xi, p < 1$ satisfy $2\gamma^2 \le \delta \xi^2 p^2$. Let $\Gamma$ be a graph with vertex subsets
$X_0, X_1, \dots, X_m$ and suppose that, for each $i = 1, \dots, m$, $(X_{i-1}, X_i)_\Gamma$ is $(p, (1 - \delta)\gamma\sqrt{|X_{i-1}||X_i|})$-
jumbled. Then we can find $\tilde X_i \subseteq X_i$ with $|\tilde X_i| \ge (1 - \delta)|X_i|$ for every $i$ such that, for every
$0 \le i \le m - 1$, the induced bipartite graph $(\tilde X_i, \tilde X_{i+1})_\Gamma$ is $(p, \xi, 1)$-bounded and $(p, \gamma\sqrt{|\tilde X_i||\tilde X_{i+1}|})$-
jumbled.
Proof. The jumbledness condition follows directly from the size of $|\tilde X_i|$, so it suffices to make the
bipartite graphs bounded. Let $\tilde X_m = X_m$. For each $i = m - 1, m - 2, \dots, 0$, in this order, set $\tilde X_i$
to be the vertices in $X_i$ with $(1 \pm \xi) p |\tilde X_{i+1}|$ neighbors in $\tilde X_{i+1}$. So $(\tilde X_i, \tilde X_{i+1})_\Gamma$ is $(p, \xi, 1)$-bounded.
Lemma 3.7 gives us $|X_i \setminus \tilde X_i| \le \delta |X_i|$.

□
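The trimming step in this proof is a simple backward sweep, which can be sketched for an unweighted graph as follows. The name `trim_layers` and the set-based representation are our own assumptions; the sketch keeps a vertex of $X_i$ exactly when its degree into the already trimmed next layer is within a factor $1 \pm \xi$ of $p$ times the size of that layer.

```python
def trim_layers(parts, adj, p, xi):
    """Backward trimming of the layers X_0, ..., X_m.

    parts: list of sets X_0, ..., X_m
    adj:   dict mapping each vertex to the set of its neighbours
    Returns the list of trimmed sets, with the last layer left unchanged.
    """
    m = len(parts) - 1
    trimmed = [None] * (m + 1)
    trimmed[m] = set(parts[m])
    for i in range(m - 1, -1, -1):  # i = m-1, m-2, ..., 0, in this order
        target = p * len(trimmed[i + 1])
        trimmed[i] = {
            x for x in parts[i]
            if abs(len(adj[x] & trimmed[i + 1]) - target) <= xi * target
        }
    return trimmed
```

Processing the layers from the last to the first matters: the degree of a vertex of $X_i$ is measured into the trimmed set in layer $i + 1$, which must already be fixed.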



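Lemma 7.14. Let $0 < q \le p \le 1$ and $\epsilon, \delta, \delta' > 0$. Let $G$ be a weighted bipartite graph with vertex
sets $X$ and $Y$. Let $\tilde X \subseteq X$ and $\tilde Y \subseteq Y$ satisfy $|\tilde X| \ge (1 - \delta)|X|$ and $|\tilde Y| \ge (1 - \delta)|Y|$. Let $\tilde G$ be
a weighted bipartite graph on $(\tilde X, \tilde Y)$ such that $G(x, y) \ge (1 - \delta')\tilde G(x, y)$ for all $x \in \tilde X$, $y \in \tilde Y$. If
$(\tilde X, \tilde Y)_{\tilde G}$ satisfies $\mathrm{DISC}_{\ge}(q, p, \epsilon)$, then $(X, Y)_G$ satisfies $\mathrm{DISC}_{\ge}(q, p, \epsilon + 2\delta + \delta')$.

Proof. For this proof we use sums instead of integrals since the integrals corresponding to $(X, Y)_G$
and $(\tilde X, \tilde Y)_{\tilde G}$ have different normalizations and can be somewhat confusing. Let $u : X \to [0, 1]$ and
$v : Y \to [0, 1]$. We have

$\sum_{x \in X} \sum_{y \in Y} G(x, y) u(x) v(y) \ge (1 - \delta') \sum_{x \in \tilde X} \sum_{y \in \tilde Y} \tilde G(x, y) u(x) v(y)$

$\ge q(1 - \delta')\, u(\tilde X) v(\tilde Y) - \epsilon p |\tilde X||\tilde Y|$

$\ge q(1 - \delta')\left(u(X) v(Y) - 2\delta |X||Y|\right) - \epsilon p |X||Y|$

$\ge q\, u(X) v(Y) - (\epsilon + 2\delta + \delta') p |X||Y|.$

□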



Proof of Lemma 7.5. We apply Lemma 7.13 to find large subsets of vertices for which the induced
subgraph of $\Gamma$ is bounded and then apply Lemma 7.12 to show that $G$ restricted to this subgraph
satisfies $\mathrm{DISC}_{\ge}$. Finally, we use Lemma 7.14 to pass the result back to the original graph.

Here are the details. Let ^ = 8c^/^ and 6 = so that the hypotheses of Lemma 17.131 



are satisfied with 7 = 



Therefore, we can find Xi C Xi with 



X, 



each i so that (Xj,Xj+i)r is (p, ^, l)-bounded and (p, jr^p 



X,, 



X 



i+l 



> (1 - 5)Xi for 
jumbled for every 



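$0 \le i \le m - 1$. Let $\tilde G$ denote the graph $G$ restricted to $\tilde X_0, \dots, \tilde X_m$. Note that the normalizations
of $G$ and $\tilde G$ are different. For instance, for any $S \subseteq \tilde X_1$ and any $x_0 \in \tilde X_0$ and $x_2 \in \tilde X_2$, we write

$G(x_0, S, x_2) = \frac{1}{|X_1|} \sum_{x_1 \in S} G(x_0, x_1) G(x_1, x_2)$

while

$\tilde G(x_0, S, x_2) = \frac{1}{|\tilde X_1|} \sum_{x_1 \in S} G(x_0, x_1) G(x_1, x_2).$

So $(\tilde X_{i-1}, \tilde X_i)_{\tilde G}$ satisfies $\mathrm{DISC}_{\ge}(q_i, p, \epsilon')$ with $\epsilon' \le \frac{|X_{i-1}||X_i|}{|\tilde X_{i-1}||\tilde X_i|}\epsilon \le 2\epsilon$. Let $\tilde G'$ denote the weighted bipartite
graph on $(\tilde X_0, \tilde X_m)$ given by

$\tilde G'(x_0, x_m) = \min\{\tilde G(x_0, \tilde X_1, \dots, \tilde X_{m-1}, x_m),\ 4p^m\}.$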
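Since $4(\sqrt{2}c)^2 \le 8c^2 \le \xi$, we can apply Lemma 7.12 to $\tilde G$ to find that $(\tilde X_0, \tilde X_m)_{\tilde G'}$ satisfies
$\mathrm{DISC}_{\ge}(q_1 \cdots q_m, p^m, 72\epsilon^{1/(2m)} + 8m\xi)$. To pass the result back to $G'$, we note that

$G'(x_0, x_m) = \min\{G(x_0, X_1, \dots, X_{m-1}, x_m),\ 4p^m\}$

$\ge \min\{G(x_0, \tilde X_1, \dots, \tilde X_{m-1}, x_m),\ 4p^m\}$

$= \min\left\{\frac{|\tilde X_1|}{|X_1|} \cdots \frac{|\tilde X_{m-1}|}{|X_{m-1}|}\, \tilde G(x_0, \tilde X_1, \dots, \tilde X_{m-1}, x_m),\ 4p^m\right\}$

$\ge (1 - \delta)^{m-1}\, \tilde G'(x_0, x_m)$

$\ge (1 - (m - 1)\delta)\, \tilde G'(x_0, x_m).$

It follows by Lemma 7.14 that $(X_0, X_m)_{G'}$ satisfies $\mathrm{DISC}_{\ge}(q_1 \cdots q_m, p^m, \epsilon')$ with
$\epsilon' \le 72\epsilon^{1/(2m)} + 8m\xi + 2\delta + (m - 1)\delta \le 72(\epsilon^{1/(2m)} + mc^{2/3})$. □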

7.2 One-sided cycle counting 

If we can perform densification to reduce H to a triangle with two dense edges, then we have a 
counting lemma for H, as shown by the following lemma. Note that we do not even need any 
jumbledness assumptions on the remaining sparse edge. 






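Lemma 7.15. Let $K_3$ denote the triangle with vertex set $\{1, 2, 3\}$. Let $G$ be a weighted graph
with vertex subsets $X_1, X_2, X_3$ such that, for all $i \ne j$, $(X_i, X_j)_G$ satisfies $\mathrm{DISC}_{\ge}(q_{ij}, p_{ij}, \epsilon)$, where
$p_{13} = p_{23} = 1$, $0 \le p_{12} \le 1$, and $0 \le q_{ij} \le p_{ij}$. Then $G(K_3) \ge q_{12} q_{13} q_{23} - 3\epsilon p_{12}$.

Proof. We have

$G(K_3) - q_{12} q_{13} q_{23} = \int_{x_1, x_2, x_3} (G(x_1, x_2) - q_{12}) G(x_1, x_3) G(x_2, x_3)\, dx_1 dx_2 dx_3$

$\qquad + q_{12} \int_{x_1, x_2, x_3} (G(x_1, x_3) - q_{13}) G(x_2, x_3)\, dx_1 dx_2 dx_3 + q_{12} q_{13} \int_{x_1, x_2, x_3} (G(x_2, x_3) - q_{23})\, dx_1 dx_2 dx_3.$

The first integral can be bounded below by $-\epsilon p_{12}$ and each of the latter two terms by $-\epsilon q_{12}$,
since $p_{13} = p_{23} = 1$. As $q_{12} \le p_{12}$, this gives the desired bound. □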
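The telescoping decomposition used in this proof is an exact algebraic identity, which can be confirmed numerically on random weighted graphs. The following sketch uses normalised sums in place of integrals; the helper name `telescoping_gap` is ours and the random weights are purely illustrative.

```python
import random


def telescoping_gap(n, q12, q13, q23, seed=0):
    """Difference between G(K_3) - q12*q13*q23 and its three-term decomposition."""
    rng = random.Random(seed)
    G12 = [[rng.random() for _ in range(n)] for _ in range(n)]
    G13 = [[rng.random() for _ in range(n)] for _ in range(n)]
    G23 = [[rng.random() for _ in range(n)] for _ in range(n)]
    triples = [(a, b, c) for a in range(n) for b in range(n) for c in range(n)]
    norm = n ** 3

    density = sum(G12[a][b] * G13[a][c] * G23[b][c] for a, b, c in triples) / norm
    t1 = sum((G12[a][b] - q12) * G13[a][c] * G23[b][c] for a, b, c in triples) / norm
    t2 = q12 * sum((G13[a][c] - q13) * G23[b][c] for a, b, c in triples) / norm
    t3 = q12 * q13 * sum(G23[b][c] - q23 for a, b, c in triples) / norm
    return abs((density - q12 * q13 * q23) - (t1 + t2 + t3))
```

The gap is zero up to floating-point rounding for any choice of weights and of $q_{12}, q_{13}, q_{23}$, since the intermediate terms cancel in pairs.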

The one-sided counting lemma can be proved by performing subdivision densification as shown 
below. 




Proof of Proposition 7.2. Let the vertices of $C_\ell$ be $\{1, 2, \dots, \ell\}$ in that order. Apply subdivision
densification (Lemma 7.4) to the subdivided edge $(1, 2, \dots, \lceil \ell/2 \rceil)$, as well as to the subdivided edge
$(\lceil \ell/2 \rceil, \lceil \ell/2 \rceil + 1, \dots, \ell)$. Conclude with Lemma 7.15. □
8 Applications



It is now relatively straightforward to prove our sparse pseudorandom analogues of Turan's theorem, 
Ramsey's theorem and the graph removal lemma. All of the proofs have essentially the same flavour. 
We begin by applying the sparse regularity lemma for jumbled graphs, Theorem 1.11. We then
apply the dense version of the theorem we are considering to the reduced graph to find a copy of 
our graph H. The counting lemma then implies that our original sparse graph must also contain 
many copies of H. 

In order to apply the counting lemma, we will always need to clean up our regular partition, 
removing all edges which are not contained in a dense regular pair. The following lemma is sufficient 
for our purposes. 

Lemma 8.1. For every $\epsilon, \alpha > 0$ and positive integer $m$, there exists $c > 0$ and a positive integer
$M$ such that if $\Gamma$ is a $(p, cpn)$-jumbled graph on $n$ vertices then any subgraph $G$ of $\Gamma$ is such that
there is a subgraph $G'$ of $G$ with $e(G') \ge e(G) - 4\alpha e(\Gamma)$ and an equitable partition of the vertex set
into $k$ pieces $V_1, V_2, \dots, V_k$ with $m \le k \le M$ such that the following conditions hold.

1. There are no edges of $G'$ within $V_i$ for any $1 \le i \le k$.

2. Every non-empty subgraph $(V_i, V_j)_{G'}$ has $d_{G'}(V_i, V_j) = q_{ij} \ge \alpha p$ and satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon)$.

Proof. Let tuq = max(32a^^, m) and S = An application of Theorem 1 1.1 11 the sparse regularity 
lemma for jumbled graphs, using min{^,e} as the parameter e in the regularity lemma, tells us 
that there exists an rj > and a positive integer M such that if T is {p, r/|?n)-jumbled then there 
is an equitable partition of the vertices of G into k pieces with mo < k < M such that all but Ok'^ 
pairs of vertex subsets {Vi,Vj)G satisfy DISC {qij,p,e). Let c = min(7/, gjgr)- 

Since F is (p, /3) -jumbled with /3 < cpn, c < and n < 2M\Vi\ for all i, the number of edges 
between Vi and Vj satisfies 

\e{Vi,Vj) - p\Vi\\Vj\\ < cpn^ < '^p\Vi\\Vj\ 

and thus lies between ^p|Vi||Vj| and Note that this also holds for i = j, allowing for the 

fact that we will count all edges twice. 

Therefore, if we remove all edges contained entirely within any Vi, we remove atmost2pfc(^) = 

< jpn^ edges. Here we used that < [f] < ^ for all i. If we remove all edges contained 
within pairs which do not satisfy the discrepancy condition, the number of edges we are removing is 
at most 2p6k'^ (^) = SpOn"^ = ^pv? . Finally, if we remove all edges contained within pairs whose 
density is smaller than ap, we remove at most otp{^ < ^pn? edges. Overall, we have removed at 
most apn^ < 4ae(F) edges. We are left with a graph G' with e(G') > e{G) — 4ae(F) edges, as 
required. □ 
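The three deletions in this cleaning step (edges inside a part, edges in pairs failing the discrepancy condition, edges in pairs of density below $\alpha p$) can be sketched as a single filter. This is an illustration under our own conventions: the name `clean_partition` is hypothetical, and the set `regular` of pairs satisfying the discrepancy condition is taken as given rather than computed.

```python
from collections import Counter


def clean_partition(edges, part_of, regular, sizes, p, alpha):
    """Return the edges of the cleaned graph G'.

    edges:   list of pairs (u, v) of vertices of G
    part_of: dict mapping each vertex to the index of its part
    regular: set of frozenset({i, j}) for pairs assumed to satisfy DISC
    sizes:   list with sizes[i] = |V_i|
    """
    between = Counter()
    for u, v in edges:
        i, j = part_of[u], part_of[v]
        if i != j:
            between[frozenset((i, j))] += 1

    kept = []
    for u, v in edges:
        i, j = part_of[u], part_of[v]
        if i == j:
            continue  # edge inside a part
        pair = frozenset((i, j))
        if pair not in regular:
            continue  # pair fails the discrepancy condition
        if between[pair] < alpha * p * sizes[i] * sizes[j]:
            continue  # pair has density below alpha * p
        kept.append((u, v))
    return kept
```

Every pair of parts that still carries an edge after this filter is both regular and of density at least $\alpha p$, which is what the counting lemma needs.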

8.1 Erdos-Stone-Simonovits theorem 

We are now ready to prove the Erdos-Stone-Simonovits theorem in jumbled graphs. We first
recall the statement. Recall that a graph $\Gamma$ is $(H, \epsilon)$-Turan if any subgraph of $\Gamma$ with at least
$\left(1 - \frac{1}{\chi(H) - 1} + \epsilon\right) e(\Gamma)$ edges contains a copy of $H$.

Theorem 1.4. For every graph $H$ and every $\epsilon > 0$, there exists $c > 0$ such that if $\beta \le cp^{d_2(H)+3} n$
then any $(p, \beta)$-jumbled graph on $n$ vertices is $(H, \epsilon)$-Turan.
Proof. Suppose that $H$ has vertex set $\{1, 2, \dots, m\}$, $\Gamma$ is a $(p, \beta)$-jumbled graph on $n$ vertices, where
$\beta \le cp^{d_2(H)+3} n$, and $G$ is a subgraph of $\Gamma$ containing at least $\left(1 - \frac{1}{\chi(H) - 1} + \epsilon\right) e(\Gamma)$ edges.

We will need to apply the one-sided counting lemma, Theorem 1.14, with $\alpha = \frac{\epsilon}{8}$ and $\theta$. We
get constants $c_0$ and $\epsilon_0 > 0$ such that if $\Gamma$ is $(p, c_0 p^{d_2(H)+3} \sqrt{|X_i||X_j|})$-jumbled and $G$ satisfies
$\mathrm{DISC}(q_{ij}, p, \epsilon_0)$, where $\alpha p \le q_{ij} \le p$, between sets $X_i$ and $X_j$ for every $1 \le i < j \le m$ with
$ij \in E(H)$, then $G(H) \ge (1 - \theta) q(H)$.

Apply Lemma 8.1 with $\alpha = \epsilon/8$ and $\epsilon_0$. This yields constants $c_1$ and $M$ such that if $\Gamma$ is
$(p, c_1 pn)$-jumbled then there is a subgraph $G'$ of $G$ with

$e(G') \ge \left(1 - \frac{1}{\chi(H) - 1} + \epsilon - 4\alpha\right) e(\Gamma) = \left(1 - \frac{1}{\chi(H) - 1} + \frac{\epsilon}{2}\right) e(\Gamma),$

where we used that $\alpha = \frac{\epsilon}{8}$. Moreover, there is an equitable partition of the vertex set into $k \le M$
pieces $V_1, \dots, V_k$ such that every non-empty subgraph $(V_i, V_j)_{G'}$ has $d(V_i, V_j) = q_{ij} \ge \alpha p$ and
satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon_0)$.
We now consider the reduced graph $R$, considering each piece $V_i$ of the partition as a vertex
$v_i$ and placing an edge between $v_i$ and $v_j$ if and only if the graph between $V_i$ and $V_j$ is non-
empty. Since $\Gamma$ is $(p, cpn)$-jumbled and $n \le 2M|V_i|$, the number of edges between any two pieces
differs from p|Vi||V^| by at most cpv? < ^p\Vi\\Vj\ provided that c < gg|p-. Note, moreover, that 
\Vi\ < \j] < (1 + provided that n > 2M£. Therefore, the number of edges in the reduced 
graph R is at least 

,.p^:. <G') + ^ / 1 e\ fk 

(1 + To)p\-k^' ~ (1 + - V X{H) -l^4j\2 

where the final step follows from e(F) > (1 — ^)p(2)- 

Applying the Erdos-Stone-Simonovits theorem to the reduced graph implies that it contains
a copy of $H$. But if this is the case then we have a collection of vertex subsets $X_1, \dots, X_m$ such
that, for every edge $ij \in E(H)$, the induced subgraph $(X_i, X_j)_{G'}$ has $d(X_i, X_j) = q_{ij} \ge \alpha p$ and
satisfies DISC(gjj,p, eo). By the counting lemma, provided c < we have G{H) > G'{H) > 
(1 — 6){apy^^\2M)~'"^^K Therefore, for c = min(^,ci, ggf^), we see that G contains a copy of 
H. □ 

The proof of the stability theorem, Theorem 1.5, is similar to the proof of Theorem 1.4, so
we confine ourselves to a sketch. Suppose that $\Gamma$ is a $(p, \beta)$-jumbled graph on $n$ vertices, where
/3 < cp'^'^^^^^^n, and G is a subgraph of F containing ^1 — — e(F) edges. An application 

of Lemma 18.11 as in the proof above allows us to show that there is a subgraph G' of G formed 
by removing at most jpn^ edges and a regular partition of G' into k pieces such that the reduced 

graph has at least ^1 — — 26^ (2) edges. This graph can contain no copies oi H - otherwise 

the original graph would have many copies of H as in the last paragraph above. From the dense 
version of the stability theorem [82] it follows that if 5 is sufficiently small then we may make R 
into a {x{H) — l)-partite graph by removing at most j^k"^ edges. We imitate this removal process 
in the graph G' . That is, if we remove edges between Vi and Vj in R then we remove all of the 
edges between Vi and Vj in G' . Since the number of edges between Vi and Vj is at most 2p|yj||l^ |, 
we will remove at most 



1 2 



< -pn 
edges in total from $G'$. Since we have already removed all edges which are contained within any $V_i$,
the resulting graph is clearly $(\chi(H) - 1)$-partite. Moreover, the total number of edges removed is
at most + ^pv? < epn?, as required. 

8.2 Ramsey's theorem 

In order to prove that the Ramsey property also holds in sparse jumbled graphs, we need the 
following lemma which says that we may remove a small proportion of edges from any sufficiently 
large clique and still maintain the Ramsey property. 

Lemma 8.2. For any graph $H$ and any positive integer $r \ge 2$, there exist $a, \eta > 0$ such that if $n$ is
sufficiently large and $G$ is any subgraph of $K_n$ of density at least $1 - \eta$, any $r$-coloring of the edges
of $G$ will contain at least $a n^{v(H)}$ monochromatic copies of $H$.

Proof. Suppose first that the edges of $K_n$ have been $r$-colored. Ramsey's theorem together with a
standard averaging argument tells us that for $n$ sufficiently large there exists $a_0$ such that there are
at least $a_0 n^{v(H)}$ monochromatic copies of $H$. Since $G$ is formed from $K_n$ by removing at most $\eta n^2$
edges, this deletion process will force us to delete at most $\eta n^{v(H)}$ copies of $H$. Therefore, provided
that $\eta \le \frac{a_0}{2}$, the result follows with $a = \frac{a_0}{2}$. □

We also need a slight variant of the sparse regularity lemma, Theorem II.IH which allows us to 
take a regular partition which works for more than one graph. 

Lemma 8.3. For every $\epsilon > 0$ and integers $\ell, m_0 \ge 1$, there exist $\eta > 0$ and a positive integer $M$
such that if $\Gamma$ is a $(p, \eta pn)$-jumbled graph on $n$ vertices and $G_1, G_2, \dots, G_\ell$ is a collection of weighted
subgraphs of $\Gamma$ then there is an equitable partition into $m_0 \le k \le M$ pieces such that for each $G_i$,
$1 \le i \le \ell$, all but at most $\epsilon k^2$ pairs of vertex subsets $(V_a, V_b)_{G_i}$ satisfy $\mathrm{DISC}(q_{ab}^{(i)}, p, \epsilon)$ for some $q_{ab}^{(i)}$.

There is also an appropriate analogue of Lemma 8.1 to go with this regularity lemma.

Lemma 8.4. For every $\epsilon, \alpha > 0$ and positive integer $m$, there exist $c > 0$ and a positive integer
$M$ such that if $\Gamma$ is a $(p, cpn)$-jumbled graph on $n$ vertices then any collection of subgraphs
$G_1, G_2, \dots, G_\ell$ of $\Gamma$ will be such that there are subgraphs $G_i'$ of $G_i$ with $e(G_i') \ge e(G_i) - 4\alpha e(\Gamma)$ and
an equitable partition of the vertex set into $k$ pieces $V_1, V_2, \dots, V_k$ with $m \le k \le M$ such that the
following conditions hold.

1. There are no edges of $G_i'$ within $V_a$ for any $1 \le i \le \ell$ and any $1 \le a \le k$.

2. Every subgraph $(V_a, V_b)_{G_i'}$ containing any edges from $G_i'$ has $d_{G_i'}(V_a, V_b) = q_{ab}^{(i)} \ge \alpha p$ and
satisfies $\mathrm{DISC}(q_{ab}^{(i)}, p, \epsilon)$.

The proof of the sparse analogue of Ramsey's theorem now follows along the lines of the proof
of Theorem 1.4 above.

Theorem 1.6. For every graph $H$ and every positive integer $r \ge 2$, there exists $c > 0$ such that if
$\beta \le cp^{d_2(H)+3}n$ then any $(p, \beta)$-jumbled graph on $n$ vertices is $(H, r)$-Ramsey.

Proof. Suppose that $H$ has vertex set $\{1, 2, \dots, m\}$, $\Gamma$ is a $(p, \beta)$-jumbled graph on $n$ vertices, where
$\beta \le cp^{d_2(H)+3}n$, and $G_1, G_2, \dots, G_r$ are subgraphs of $\Gamma$ where $G_i$ is the subgraph whose edges have
been colored in color $i$.

Let $a, \eta$ be the constants given by Lemma 8.2. That is, for $n \ge n_0$, any subgraph of $K_n$ of
density at least $1 - \eta$ is such that any $r$-coloring of its edges contains at least $a n^{v(H)}$ monochromatic
copies of $H$. We will need to apply the one-sided counting lemma, Theorem 1.14, with a = ^
and $\theta$. We get constants $c_0$ and $\epsilon_0 > 0$ such that if $\Gamma$ is $(p, c_0 p^{d_2(H)+3}\sqrt{|X_i||X_j|})$-jumbled and $G$
satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon_0)$, where $\alpha p \le q_{ij} \le p$, between sets $X_i$ and $X_j$ for every $1 \le i < j \le m$ with
$ij \in E(H)$, then $G(H) \ge (1 - \theta)q(H)$.

We apply Lemma [8.41 to the collection Gi with a = eo and m = uq. This yields ci > 
and a positive integer M such that if T is (p, cipn)-jumbled then there is a collection of graphs 
G'^ such that e{G'^ > e{Gi) — 4ae(r) and every subgraph (14, Vf,)^' containing any edges from G'^ 

has dQi {Va,Vb) = q^^l > ap and satisfies DISC(g'^*j),|?, e). Adding over all r graphs, we will have 
removed at most 4rae(r) = §e(r) edges. Let G' be the union of the G\. This graph has density at 
least 1 — ^ in r. 

We now consider the colored reduced (multi)graph $R$, considering each piece $V_a$ of the partition
as a vertex $v_a$ and placing an edge of color $i$ between $v_a$ and $v_b$ if the graph between $V_a$ and $V_b$
contains an edge of color $i$. Since $\Gamma$ is $(p, cpn)$-jumbled and $n \le 2M|V_i|$, the number of edges
between any two pieces differs from p|Vi||y^| by at most cpn^ < ^p\Vi\\Vj \ provided that c < gQ^- 
Note, moreover, that \Vi\ < [^] < (1 + provided that n > Therefore, the number of 

edges in the reduced graph R is at least 

e{G') ^ (l-f)e(r) fk 
^-(l + l5)pm^-(l + l5)Mf)^-^ "^^12 

where the final step follows from e(r) > (1 — ^)p{2)- 

We now apply Lemma 8.2 to the reduced graph. Since $k \ge m = n_0$, there exists a monochromatic
copy of $H$ in the reduced graph, in color $i$, say. But if this is the case then we have a
collection of vertex subsets $X_1, \dots, X_m$ such that, for every edge $ab \in E(H)$, the induced subgraph
$(X_a, X_b)_{G_i'}$ has $d_{G_i'}(X_a, X_b) = q_{ab}^{(i)} \ge \alpha p$ and satisfies $\mathrm{DISC}(q_{ab}^{(i)}, p, \epsilon_0)$. By the counting
lemma, provided c < we have G{H) > G'i{H) > (1 - 9){apY^^\2M)-'''^^l Therefore, for 
c = min(^, ci, ng ^, gglp')' that G contains a copy of H. □ 

8.3 Graph removal lemma 

We prove that the graph removal lemma also holds in sparse jumbled graphs. The proof is much
the same as the proof for Turán's theorem, though we include it for completeness.

Theorem 1.1. For every graph $H$ and every $\epsilon > 0$, there exist $\delta > 0$ and $c > 0$ such that if
$\beta \le cp^{d_2(H)+3}n$ then any $(p, \beta)$-jumbled graph $\Gamma$ on $n$ vertices has the following property. Any
subgraph of $\Gamma$ containing at most $\delta p^{e(H)}n^{v(H)}$ copies of $H$ may be made $H$-free by removing at most
$\epsilon pn^2$ edges.

Proof. Suppose that $H$ has vertex set $\{1, 2, \dots, m\}$, $\Gamma$ is a $(p, \beta)$-jumbled graph on $n$ vertices, where
$\beta \le cp^{d_2(H)+3}n$, and $G$ is a subgraph of $\Gamma$ containing at most $\delta p^{e(H)}n^{v(H)}$ copies of $H$.

We will need to apply the one-sided counting lemma, Theorem 1.14, with $\alpha = \frac{\epsilon}{16}$ and $\theta = \frac{1}{2}$.
We get constants $c_0$ and $\epsilon_0 > 0$ such that if $\Gamma$ is $(p, c_0 p^{d_2(H)+3}\sqrt{|X_i||X_j|})$-jumbled and $G$ satisfies
$\mathrm{DISC}(q_{ij}, p, \epsilon_0)$, where $\alpha p \le q_{ij} \le p$, between sets $X_i$ and $X_j$ for every $1 \le i < j \le m$ with
$ij \in E(H)$, then $G(H) \ge \frac{1}{2}q(H)$.

Apply Lemma 8.1 with $\alpha = \epsilon/16$ and $\epsilon_0$. This yields constants $c_1$ and $M$ such that if $\Gamma$ is
$(p, c_1pn)$-jumbled then there is a subgraph $G'$ of $G$ with

$$e(G') \ge e(G) - 4\alpha e(\Gamma) \ge e(G) - \frac{\epsilon}{4}e(\Gamma) \ge e(G) - \epsilon pn^2,$$

where we used that $\alpha = \frac{\epsilon}{16}$. Moreover, there is an equitable partition into $k \le M$ pieces $V_1, \dots, V_k$
such that every non-empty subgraph $(V_i, V_j)_{G'}$ has $d(V_i, V_j) = q_{ij} \ge \alpha p$ and satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon_0)$.

Suppose now that there is a copy of $H$ left in $G'$. If this is the case then we have a collection of
vertex subsets $X_1, \dots, X_m$ such that, for every edge $ij \in E(H)$, the induced subgraph $(X_i, X_j)_{G'}$
has $d_{G'}(X_i, X_j) = q_{ij} \ge \alpha p$ and satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon_0)$. By the counting lemma, provided $c \le \frac{c_0}{2M}$,
we have $G(H) \ge G'(H) \ge \frac{1}{2}(\alpha p)^{e(H)}(2M)^{-v(H)}$. Therefore, for $c = \min(\frac{c_0}{2M}, c_1)$ and $\delta =
\frac{1}{2}\alpha^{e(H)}(2M)^{-v(H)}$, we see that $G$ contains at least $\delta p^{e(H)}n^{v(H)}$ copies of $H$, contradicting our
assumption about $G$. □



8.4 Removal lemma for groups 

We recall the following removal lemma for groups. Its proof is a straightforward adaptation of the
proof of the dense version given by Král', Serra and Vena [63].

For the rest of this section, let = 3, k4 = 2, km = ^ + if m > 5 is odd, and fcm = 1 + 
if m > 6 is even. 

Theorem 1.2. For each $\epsilon > 0$ and positive integer $m$, there are $c, \delta > 0$ such that the following
holds. Suppose $B_1, \dots, B_m$ are subsets of a group $G$ of order $n$ such that each $B_i$ is $(p, \beta)$-jumbled
with $\beta \le cp^{k_m}n$. If subsets $A_i \subseteq B_i$ for $i = 1, \dots, m$ are such that there are at most $\delta|B_1| \cdots |B_m|/n$
solutions to the equation $x_1x_2 \cdots x_m = 1$ with $x_i \in A_i$ for all $i$, then it is possible to remove at
most $\epsilon|B_i|$ elements from each set $A_i$ so as to obtain sets $A_i'$ for which there are no solutions to
$x_1x_2 \cdots x_m = 1$ with $x_i \in A_i'$ for all $i$.

We saw above that the one-sided counting lemma gives the graph removal lemma. For cycles,
the removal lemma follows from Proposition 7.2. The version we need is stated below.

Proposition 8.5. For every $m \ge 3$ and $\epsilon > 0$, there exist $\delta > 0$ and $c > 0$ so that any graph
$\Gamma$ with vertex subsets $X_1, \dots, X_m$, each of size $n$, satisfying $(X_i, X_{i+1})_\Gamma$ being $(p, \beta)$-jumbled with
$\beta \le cp^{k_m}n$ for each $i = 1, \dots, m$ (index taken mod $m$) has the following property. Any subgraph
of $\Gamma$ containing at most $\delta p^m n^m$ copies of $C_m$ may be made $C_m$-free by removing at most $\epsilon pn^2$ edges,
where we only consider embeddings of $C_m$ into $\Gamma$ where the $i$-th vertex of $C_m$ embeds into $X_i$.

Proof of Theorem 1.2. Let $\Gamma$ denote the graph with vertex set $G \times \{1, \dots, m\}$, the second coordinate
taken modulo $m$, and with vertex $(g, i)$ colored $i$. Form an edge from $(y, i)$ to $(z, i+1)$ in $\Gamma$ if
and only if $z = yx_i$ for some $x_i \in B_i$, and let $G_0$ be a subgraph of $\Gamma$ consisting of those edges
with $x_i \in A_i$. Observe that colored $m$-cycles in the graph $G_0$ correspond exactly to $(m+1)$-tuples
$(y, x_1, x_2, \dots, x_m)$ with $y \in G$ and $x_i \in A_i$ for each $i$ satisfying $x_1x_2 \cdots x_m = 1$. The hypothesis
implies that there are at most $\delta|B_1| \cdots |B_m| \le \delta 2^m p^m n^m$ colored $m$-cycles in the graph $G_0$, where
we assumed that $c \le \frac{1}{2}$ so that $\frac{1}{2}pn \le |B_i| \le \frac{3}{2}pn$ by jumbledness. Then by the cycle removal
lemma (Proposition 8.5) we can choose $c$ and $\delta$ so that $G_0$ can be made $C_m$-free by removing at
most $\frac{\epsilon}{2m}pn^2$ edges.

In $A_i$, remove the element $x_i$ if at least $\frac{n}{m}$ edges of the form $(y, i)(yx_i, i+1)$ have been removed.
Since we removed at most $\frac{\epsilon}{2m}pn^2$ edges, we remove at most $\frac{\epsilon}{2}pn \le \epsilon|B_i|$ elements from each $A_i$. Let
$A_i'$ denote the remaining elements of $A_i$. For any solution to $x_1x_2 \cdots x_m = 1$ for $x_i \in A_i$, consider
the $n$ edge-disjoint $m$-cycles $(g, 1)(gx_1, 2)(gx_1x_2, 3) \cdots (gx_1 \cdots x_{m-1}, m)$ in the graph $G_0$ for $g \in G$.
We must have removed at least one edge from each of the $n$ cycles, and so we must have removed
at least $\frac{n}{m}$ edges of the form $(y, i)(yx_i, i+1)$ for some $i$, which implies that $x_i \notin A_i'$. It follows that
there is no solution to $x_1x_2 \cdots x_m = 1$ with $x_i \in A_i'$ for all $i$. □
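
The correspondence between solutions and colored cycles used in this proof can be checked on a toy example. The sketch below is an illustration only: it assumes the group is the cyclic group $\mathbb{Z}_n$ written additively (so the equation becomes $x_1 + \cdots + x_m \equiv 0$), and the sets and function names are hypothetical choices. It confirms that each solution gives exactly $n$ colored $m$-cycles, one for each starting element, and that these $n$ cycles are edge-disjoint.

```python
from itertools import product


def solutions(n, sets):
    """All tuples (x_1, ..., x_m) with x_i in sets[i] and x_1 + ... + x_m = 0 mod n."""
    return [xs for xs in product(*sets) if sum(xs) % n == 0]


def cycle_edges(n, y, xs):
    """Edges ((g, i), (g + x_i, i + 1)) of the closed walk started at (y, 1).

    Levels are 1..m, taken cyclically, so the last edge returns to level 1.
    """
    m = len(xs)
    edges = []
    g = y
    for i, x in enumerate(xs, start=1):
        nxt = (g + x) % n
        edges.append(((g, i), (nxt, i % m + 1)))
        g = nxt
    return edges


def colored_cycles(n, sets):
    """All pairs (y, xs) whose walk closes up, i.e. colored m-cycles in G_0."""
    cycles = []
    for y in range(n):
        for xs in product(*sets):
            last_vertex = cycle_edges(n, y, xs)[-1][1]
            if last_vertex == (y, 1):
                cycles.append((y, xs))
    return cycles


n = 7
A = [{1, 2}, {2, 3}, {3, 4, 5}]  # illustrative subsets A_1, A_2, A_3 of Z_7

sols = solutions(n, A)
cycles = colored_cycles(n, A)

# The n cycles arising from one fixed solution share no edges.
one_solution = sols[0]
edges_of_family = [e for y in range(n) for e in cycle_edges(n, y, one_solution)]
print(len(sols), len(cycles), len(set(edges_of_family)))
```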

In [63], the authors also proved removal lemmas for systems of equations which are graph
representable. For instance, the system

$$x_1x_2x_4^{-1}x_3^{-1} = 1$$
$$x_1x_2x_5^{-1} = 1$$

can be represented by the graph below, in the sense that solutions to the above system correspond
to embeddings of this graph into some larger graph with vertex set $G \times \{1, \dots, 4\}$, similar to how
solutions to $x_1x_2 \cdots x_m = 1$ correspond to cycles in the proof of Theorem 1.2. We refer to the paper
[63] for the precise statements. These results can be adapted to the sparse setting in a manner
similar to Proposition 8.5.




9 Concluding remarks 

We conclude with discussions on the sharpness of our results, a sparse extension of quasirandom 
graphs, induced extensions of the various counting and extremal results, other sparse regularity 
lemmas, algorithmic applications and sparse Ramsey and Turan-type multiplicity results. 

9.1 Sharpness of results 

We have already noted in the introduction that for every $H$ there are $(p, \beta)$-jumbled graphs $\Gamma$ on $n$
vertices, with $\beta = O(p^{(d(H)+2)/4}n)$, such that $\Gamma$ does not contain a copy of $H$. On the other hand,
the results of Section 6 tell us that we can always find copies of $H$ in $\Gamma$ provided that $\beta \le cp^{d_2(H)+1}n$
and in $G$ provided that $\beta \le cp^{d_2(H)+3}n$. So, since $d_2(H)$ and $d(H)$ differ by at most a constant
factor, our results are sharp up to a multiplicative constant in the exponent for all $H$. However,
we believe that our results are likely to be sharp up to an additive constant for the exponent of $p$
in the jumbledness parameter, with some caveats.

An old conjecture of Erdős [32] asserts that if $H$ is a $d$-degenerate bipartite graph then there
exists $C > 0$ such that every graph $G$ on $n$ vertices with at least $Cn^{2 - \frac{1}{d}}$ edges contains a copy of
$H$. This conjecture is known to hold for some bipartite graphs such as $K_{t,t}$ but remains open in
general. The best result to date, due to Alon, Krivelevich and Sudakov [7], states that if $G$ has
$Cn^{2 - \frac{1}{4d}}$ edges then it contains a copy of $H$.

If Erdős' conjecture is true then this would mean that copies of bipartite $H$ begin to appear
already when the density is around $n^{-1/d(H)}$, without any need for a jumbledness condition. If
$d_2(H) = d(H) - 1$ then, even for optimally jumbled graphs, our results only apply down to densities
of about $n^{-1/(2d(H)-1)}$.
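
The density threshold just quoted can be recovered by a short calculation. The following is a sketch under two assumptions taken from the surrounding discussion rather than proved here: that "optimally jumbled" means $\beta = \Theta(\sqrt{pn})$, and that the requirement for finding copies of $H$ in $\Gamma$ has the form $\beta \le cp^{d_2(H)+1}n$.

```latex
\begin{align*}
\sqrt{pn} \le c\,p^{d_2(H)+1}\,n
  &\iff p^{\,d_2(H)+\frac{1}{2}} \ge c^{-1}\,n^{-\frac{1}{2}} \\
  &\iff p \ge c^{-\frac{2}{2d_2(H)+1}}\; n^{-\frac{1}{2d_2(H)+1}} .
\end{align*}
% With d_2(H) = d(H) - 1 the exponent of n is -1/(2d(H) - 1).
% Since 2d(H) - 1 >= d(H) whenever d(H) >= 1, this threshold is at least
% as large as the density n^{-1/d(H)} predicted by the conjecture of Erdos.
```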

However, we considered embeddings of H into F such that each vertex {1, 2, . . . , m} of H is 
to be embedded into a separate vertex subset Xi. We believe that in this setting our results are 
indeed sharp up to an additive constant, even in the case H is bipartite. Without this caveat of 
embedding each vertex of H into a separate vertex subset in F, we still believe that our results 
should be sharp for many classes of graphs. In particular, we believe the conjecture \38\ [Ml 
that there is a $(p, cp^{t-1}n)$-jumbled graph which does not contain a copy of $K_t$.

One thing which we have left undecided is whether the jumbledness condition for appearance
of copies of $H$ in regular subgraphs $G$ of $\Gamma$ should be the same as that for the appearance of copies
of $H$ in $\Gamma$ alone. For this question, it is natural to consider the case of triangles where we know
that there are $(p, cp^2n)$-jumbled graphs on $n$ vertices which do not contain any triangles. That is,
we know the embedding result for $\Gamma$ is best possible. The result of Sudakov, Szabó and Vu [84]
mentioned in the introduction also gives us a sharp result for the $(K_3, \epsilon)$-Turán property. In the
next subsection, we will obtain a similar sharp bound for the $(K_3, 2)$-Ramsey property.

While these Turán and Ramsey-type results are suggestive, we believe that the jumbledness
condition for counting in $G$ should be stronger than that for counting in $\Gamma$. The results
mentioned above are sharp because there are alternative proofs of Turán's theorem for
cliques and Ramsey's theorem for the triangle which only need counting results in $\Gamma$ rather than
within some regular subgraph $G$. Such a workaround seems unlikely to work for the triangle removal
lemma. Kohayakawa, Rödl, Schacht and Skokan [59] conjecture that the jumbledness condition in
the sparse triangle removal lemma, Theorem 1.2, can be improved from $\beta = o(p^3n)$ to $\beta = o(p^2n)$.
We conjecture that the contrary holds.



9.2 Relative quasirandomness 

The study of quasirandom graphs began in the pioneering work of Thomason [88, 89] and Chung,
Graham, and Wilson [19]. As briefly discussed in Section 1.1, they showed that a large number
of interesting graph properties satisfied by random graphs are all equivalent. Perhaps the most 
surprising aspect of this work is that if the number of cycles of length 4 in a graph is as one would 
expect in a binomial random graph of the same density, then this is enough to imply that the 
edges are very well-spread and the number of copies of any fixed graph is as one would expect in a 
binomial random graph of the same density. 

There has been a considerable amount of research aimed at extending quasirandomness to sparse
graphs, see [17, 18, 56, 61]. However, the key property of counting small subgraphs was missing
from previous results in this area. The following theorem extends the fundamental results in this
area to the setting of subgraphs of (possibly sparse) pseudorandom graphs. The case where $p = 1$ and $\Gamma$
is the complete graph corresponds to the original setting. We prove that the natural analogues of
the original quasirandom properties in this more general setting are all equivalent. Of particular
note is the inclusion of the count of small subgraphs, a key property missing from previous results
in this area. The proofs of some of the implications extend easily from the dense case. However, to
imply the notable counting properties, we use the counting lemma, Theorem 1.12, which acts as a
transference principle from the sparse setting to the dense setting.

Such quasirandomness of a structure within a sparse but pseudorandom structure is known as 
relative quasirandomness. This concept has been instrumental in the development of the hypergraph 
regularity and counting lemma [HI EH [771 [87] . the 3-uniform case, for example, one repeatedly 
has to deal with 3-uniform hypergraphs which are subsets of the triangles of a very pseudorandom 
graph. 

To keep the theorem statement simple, we first describe some notation. The co-degree $d_G(v, v')$
of two vertices $v, v'$ in a graph $G$ is the number of vertices which are adjacent to both $v$ and $v'$.
For a graph $H$, we let $s(H) = \min\left\{\frac{\Delta(L(H))+4}{2}, \frac{d(L(H))+6}{2}\right\}$. For a graph $H$ and another graph $G$,
let $N_H(G)$ denote the number of labeled copies of $H$ (as a subgraph) in $G$.

Theorem 9.1. Let $k \ge 2$ be a positive integer. For $n \ge 1$, let $\Gamma = \Gamma_n$ be a $(p, \beta)$-jumbled graph on
$n$ vertices with $p = p(\Gamma)$ and $\beta = \beta(\Gamma) = o(p^kn)$, and $G = G_n$ be a spanning subgraph of $\Gamma_n$. The
following are equivalent.

$P_1$: For all vertex subsets $S$ and $T$,

$$|e_G(S, T) - q|S||T|| = o(pn^2).$$

$P_2$: For all vertex subsets $S$,

$$\left|e_G(S) - q\frac{|S|^2}{2}\right| = o(pn^2).$$

$P_3$: For all vertex subsets $S$ with $|S| = \lfloor \frac{n}{2} \rfloor$,

$$\left|e_G(S) - q\frac{n^2}{8}\right| = o(pn^2).$$

$P_4$: For each graph $H$ with $k \ge s(H)$,

$$N_H(G) = q^{e(H)}n^{v(H)} + o(p^{e(H)}n^{v(H)}).$$

$P_5$: $e(G) \ge q\frac{n^2}{2} + o(pn^2)$ and

$$N_{C_4}(G) \le q^4n^4 + o(p^4n^4).$$

$P_6$: $e(G) \ge (1 + o(1))q\frac{n^2}{2}$, $\lambda_1 = (1 + o(1))qn$, and $\lambda_2 = o(pn)$, where $\lambda_i$ is the $i$th largest eigenvalue,
in absolute value, of the adjacency matrix of $G$.

$P_7$:

$$\sum_{v, v' \in V(G)} |d_G(v, v') - q^2n| = o(p^2n^3).$$



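We briefly describe how to prove the equivalences between the various properties in Theorem 9.1,
following the chain of implications

$P_3 \Leftrightarrow P_2 \Leftrightarrow P_1 \Rightarrow P_4 \Rightarrow P_5 \Rightarrow P_1$, $\quad P_5 \Rightarrow P_6 \Rightarrow P_1$, $\quad P_5 \Leftrightarrow P_7$.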



The equivalence between the discrepancy properties $P_1$, $P_2$, $P_3$ is fairly straightforward and similar
to the dense case. Theorem 1.12 shows that $P_1$ implies $P_4$. As $P_5$ is a special case of $P_4$, we
have that $P_4$ implies $P_5$. Proposition 5.4 shows that $P_5$ implies $P_1$. The fact $P_5$ implies $P_6$ follows
easily from the identity that the trace of the fourth power of the adjacency matrix of $G$ is both the
number of closed walks in $G$ of length 4, and the sum of the fourth powers of the eigenvalues of the
adjacency matrix of $G$. The fact that $P_6$ implies $P_1$ is the standard proof of the expander mixing
lemma. The fact $P_5$ implies $P_7$ follows easily from the identity




where the sum is over all $\binom{n}{2}$ pairs of distinct vertices, as well as the identity

v.v' V ^ ^ 
and two applications of the Cauchy-Schwarz inequality. Finally, we have $P_7$ implies $P_5$ for the
following reason. From (14), we have $P_5$ is equivalent to

$$\sum_{v, v'} d_G(v, v')^2 = \frac{1}{2}q^4n^4 + o(p^4n^4). \qquad (15)$$

To verify (15), we split up the sum on the left into three sums. The first sum is over pairs $v, v'$
with $|d_G(v, v') - q^2n| = o(p^2n)$, the second sum is over pairs $v, v'$ with $d_G(v, v') > 2p^2n$, and
the third sum is over the remaining pairs $v, v'$. From $P_7$, almost all pairs $v, v'$ of vertices satisfy
$|d_G(v, v') - q^2n| = o(p^2n)$, and so the first sum is $\frac{1}{2}q^4n^4 + o(p^4n^4)$. The second sum satisfies

$$\sum_{v, v' : d_G(v, v') > 2p^2n} d_G(v, v')^2 \le \sum_{v, v' : d_\Gamma(v, v') > 2p^2n} d_\Gamma(v, v')^2 = o(p^4n^4),$$

where the first inequality follows since $G$ is a subgraph of $\Gamma$, and the second estimate follows
from pseudorandomness in $\Gamma$. Finally, as $P_7$ implies there are $o(n^2)$ pairs $v, v'$ not satisfying
$|d_G(v, v') - q^2n| = o(p^2n)$, and the codegrees appearing in the third sum are at most $2p^2n$, the third sum is
$o(p^4n^4)$. This completes the proof sketch of the equivalences between the various properties in
Theorem 9.1.
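
The two counting identities that drive this sketch are easy to verify on small graphs. The code below is an illustrative check, not part of the proof: it confirms that the number of closed walks of length 4 (the trace of the fourth power of the adjacency matrix) equals the sum of squared codegrees over all ordered pairs, including pairs with $v = v'$ where the codegree is the degree, and that the number of labeled copies of $C_4$ equals the sum of $d_G(v, v')(d_G(v, v') - 1)$ over ordered pairs of distinct vertices. The helper names and the two test graphs are choices made for this example.

```python
from itertools import product


def from_edges(n, edges):
    """Symmetric 0/1 adjacency matrix of a simple graph on range(n)."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def codegrees(adj):
    """Entry (u, v) is the number of common neighbours of u and v (the degree if u == v)."""
    n = len(adj)
    return [[sum(adj[u][w] * adj[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]


def closed_4_walks(adj):
    """Brute-force count of closed walks a-b-c-d-a, which equals trace(adj^4)."""
    n = len(adj)
    return sum(adj[a][b] * adj[b][c] * adj[c][d] * adj[d][a]
               for a, b, c, d in product(range(n), repeat=4))


def labeled_c4(adj):
    """Labeled copies of C_4: closed 4-walks that visit four distinct vertices."""
    n = len(adj)
    return sum(adj[a][b] * adj[b][c] * adj[c][d] * adj[d][a]
               for a, b, c, d in product(range(n), repeat=4)
               if len({a, b, c, d}) == 4)


def identities_hold(adj):
    """Check both codegree identities against the brute-force counts."""
    n = len(adj)
    d = codegrees(adj)
    sum_squares = sum(d[u][v] ** 2 for u in range(n) for v in range(n))
    falling = sum(d[u][v] * (d[u][v] - 1)
                  for u in range(n) for v in range(n) if u != v)
    return sum_squares == closed_4_walks(adj) and falling == labeled_c4(adj)


k4 = from_edges(4, [(a, b) for a in range(4) for b in range(a + 1, 4)])
c5 = from_edges(5, [(i, (i + 1) % 5) for i in range(5)])
print(identities_hold(k4), identities_hold(c5))
```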



9.3 Induced extensions of counting lemmas and extremal results 

With not much extra effort, we can establish induced versions of the various counting lemmas and
extremal results for graphs. We assume that we are in Setup 4.1 with the additional condition
that, in Setup 3.1, the graph $\Gamma$ satisfies the jumbledness condition for all pairs $ab$ of vertices and
not just the sparse edges of $H$. Define a strongly induced copy of $H$ in $G$ to be a copy of $H$ in $G$
such that the nonedges of the copy of $H$ are nonedges of $\Gamma$. Since $G$ is a subgraph of $\Gamma$, a strongly
induced copy of $H$ is an induced copy of $H$. Define

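$$G^*(H) := \int_{x_1 \in X_1, \dots, x_m \in X_m} \prod_{(i,j) \in E(H)} G(x_i, x_j) \prod_{(i,j) \notin E(H)} (1 - \Gamma(x_i, x_j)) \, dx_1 \cdots dx_m$$

and

$$q^*(H) := \prod_{(i,j) \in E(H)} q_{ij} \prod_{(i,j) \notin E(H)} (1 - p).$$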

Note that $G^*(H)$ is the probability that a random compatible map forms a strongly induced copy
of $H$, and $q^*(H)$ is the idealized version. Also note that if $\Gamma$ is $(p, \beta)$-jumbled, then its complement
$\bar{\Gamma}$ is $(1 - p, \beta)$-jumbled. Hence, for $p$ small, we expect that most copies of $H$ guaranteed by Theorem
1.14 are strongly induced. This is formalized in the following theorem, which is an induced analogue
of the one-sided counting lemma, Theorem 1.14.

Theorem 9.2. For every fixed graph $H$ on vertex set $\{1, 2, \dots, m\}$ and every $\theta > 0$, there exist
constants $c > 0$ and $\epsilon > 0$ such that the following holds.

Let $\Gamma$ be a graph with vertex sets $X_1, \dots, X_m$ and suppose that $p \le \frac{1}{m}$ and the bipartite graph
$(X_i, X_j)_\Gamma$ is $(p, cp^{d(H)+\frac{5}{2}}\sqrt{|X_i||X_j|})$-jumbled for every $i < j$. Let $G$ be a subgraph of $\Gamma$, with the
vertex $i$ of $H$ assigned to the vertex set $X_i$ of $G$. For each edge $ij$ in $H$, assume that $(X_i, X_j)_G$
satisfies $\mathrm{DISC}(q_{ij}, p, \epsilon)$. Then

$$G^*(H) \ge (1 - \theta)q^*(H).$$
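
The remark preceding Theorem 9.2, that the complement of a $(p, \beta)$-jumbled graph is $(1 - p, \beta)$-jumbled, can be checked exactly on a small example. The sketch below assumes the bipartite form of the definition, namely $|e(S, T) - p|S||T|| \le \beta\sqrt{|S||T|}$ for all $S \subseteq X$ and $T \subseteq Y$, and computes the smallest admissible $\beta^2$ by brute force with exact rational arithmetic. The edge set and function names are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(items):
    """Yield every nonempty subset of `items` as a tuple."""
    items = list(items)
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def jumbledness_squared(X, Y, edges, p):
    """Smallest beta^2 with |e(S,T) - p|S||T|| <= beta*sqrt(|S||T|) for all nonempty S, T."""
    best = Fraction(0)
    for S in nonempty_subsets(X):
        for T in nonempty_subsets(Y):
            e = sum((s, t) in edges for s in S for t in T)
            deviation = e - p * len(S) * len(T)
            best = max(best, deviation * deviation / (len(S) * len(T)))
    return best


X = range(4)
Y = range(4)
edges = {(0, 0), (0, 1), (1, 2), (2, 3), (3, 0), (3, 3)}  # illustrative bipartite graph
complement = {(x, y) for x in X for y in Y} - edges
p = Fraction(len(edges), len(X) * len(Y))

beta_sq = jumbledness_squared(X, Y, edges, p)
beta_sq_complement = jumbledness_squared(X, Y, complement, 1 - p)
print(beta_sq, beta_sq_complement)
```

The two values agree because $e_{\bar{\Gamma}}(S, T) - (1 - p)|S||T| = -(e_\Gamma(S, T) - p|S||T|)$ for every pair $S, T$.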
We next discuss how the proof of Theorem 9.2 is a minor modification of the proof of Theorem 1.14.
As in the proof of Theorem 1.14, after $j - 1$ steps in the embedding, we have picked
$f(v_1), \dots, f(v_{j-1})$ and have subsets $T(i, j-1) \subseteq X_i$ for $j \le i \le m$ which consist of the possible
vertices for $f(v_i)$ given the choice of the first $j - 1$ embedded vertices. We are left with the task
of picking a good set $W(j) \subseteq T(j, j-1)$ of possible vertices $w = f(v_j)$ to continue the embedding
with the desired properties. We will guarantee, in addition to the three properties stated there
(which may be maintained since $d_2(H) + 3 \le d(H) + \frac{5}{2}$), that

4. $|N_{\bar{\Gamma}}(w) \cap T(i, j-1)| \ge (1 - p - \sqrt{c})|T(i, j-1)|$ for each $i > j$ which is not adjacent to $j$.

As for each such $i$, if $w$ is chosen for $f(v_j)$, $T(i, j) = N_{\bar{\Gamma}}(w) \cap T(i, j-1)$, this will guarantee
that $|T(i, j)| \ge (1 - p - \sqrt{c})|T(i, j-1)|$. As for each such $i$, the set $T(i, j)$ is only slightly smaller
than $T(i, j-1)$, this will affect the discrepancy between each pair of sets by at most a factor
$(1 - p - \sqrt{c})^2$. This additional fourth property makes the set $W(j)$ only slightly smaller. Indeed, to
guarantee this property, we need that for each of the nonneighbors $i > j$ of $j$, the vertices $w$ with
fewer than $(1 - p - \sqrt{c})|T(i, j-1)|$ nonneighbors in $T(i, j-1)$ in graph $\Gamma$ are not in $W(j)$, and there
are at most such vertices for each i by Lemma 13.71 As there are at most m choices of i, 



and \T{i,j — 1)| ^ (^1 lii^j — we get that satisfying this additional fourth property 

requires that the number of additional vertices deleted to obtain W{j) is at most 

mcp^^(^)+5 , mcp^'^(^)+^ 



which is negligible since both $q(i, j-1)$ and $q(j, j-1)$ are at most $p^{d(H)}$. We therefore see that, after
changing the various parameters in the proof of Theorem 1.14 slightly, the simple modification of
the proof sketched above completes the proof of Theorem 9.2. We remark that the assumption
$p \le \frac{1}{m}$ can be replaced by the assumption that $p$ is bounded away from 1, which is needed as we must guarantee that
the nonedges of the induced copy of $H$ are nonedges of $\Gamma$. We also note that the exponent of
$p$ in the jumbledness assumption in Theorem 9.2 is $d(H) + \frac{5}{2}$ and not $d_2(H) + 3$ for the following
reason. In addition to using the inheritance of regularity to get that edges of $H$ map to edges of $G$
in the strongly induced copies of $H$ we are counting, we also use jumbledness of $\Gamma$ to get that the
nonedges of $H$ map to nonedges of $\Gamma$.

In the greedy proof sketched above, to conclude that the good set $W(j) \subseteq T(j, j-1)$ of possible
vertices $w = f(v_j)$ is large, we use the jumbledness of $\Gamma$ and that, for the vertices $i > j$ not adjacent
to $j$, the product $|T(j, j-1)||T(i, j-1)|$ is large. The sizes of these sets are related to the number
of neighbors of $j$ less than $j$ and the number of neighbors of $i$ less than $j$, respectively.

The induced graph removal lemma was proved by Alon, Fischer, Krivelevich, and Szegedy [5].
It states that for each graph $H$ and $\epsilon > 0$ there is $\delta > 0$ such that every graph on $n$ vertices with
at most $\delta n^{v(H)}$ induced copies of $H$ can be made induced $H$-free by adding or deleting at most $\epsilon n^2$
edges. This clearly extends the original graph removal lemma. To prove the induced graph removal
lemma, they developed the strong regularity lemma, whose proof involves iterating Szemerédi's
regularity lemma many times. A new proof of the induced graph removal lemma which gives an
improved quantitative estimate was recently obtained in [23].

The first application of Theorem 9.2 we discuss is an induced extension of the sparse graph
removal lemma, Theorem 1.1. It does not imply the induced graph removal lemma above.

Theorem 9.3. For every graph $H$ and every $\epsilon > 0$, there exist $\delta > 0$ and $c > 0$ such that if
$\beta \le cp^{d(H)+\frac{5}{2}}n$ then any $(p, \beta)$-jumbled graph $\Gamma$ on $n$ vertices with $p \le \frac{1}{v(H)}$ has the following
property. Any subgraph of $\Gamma$ containing at most $\delta p^{e(H)}n^{v(H)}$ (strongly) induced copies of $H$ may be
made $H$-free by removing at most $\epsilon pn^2$ edges.

The proof of Theorem 9.3 is the same as the proof of Theorem 1.1, except we replace the
one-sided counting lemma, Theorem 1.14, with its induced variant, Theorem 9.2. Note that unlike
the standard induced graph removal lemma, here it suffices only to delete edges. Furthermore, all
copies of $H$, not just induced copies, are removed by the deletion of few edges.

The induced Ramsey number $r_{\mathrm{ind}}(H; r)$ is the smallest natural number $N$ for which there is
a graph $G$ on $N$ vertices such that in every $r$-coloring of the edges of $G$ there is an induced
monochromatic copy of $H$. The existence of these numbers was independently proven in the early
1970s by Deuber [27], Erdős, Hajnal and Pósa [33], and Rödl [73]. The bounds that these original
proofs give on $r_{\mathrm{ind}}(H; r)$ are enormous. However, Trotter conjectured that the induced Ramsey
number of bounded degree graphs is at most polynomial in the number of vertices. That is, for
each $\Delta$ there is $c(\Delta)$ such that $r_{\mathrm{ind}}(H; 2) \le n^{c(\Delta)}$. This was proved by Łuczak and Rödl [68], who
gave an enormous upper bound on $c(\Delta)$, namely, a tower of twos of height $O(\Delta^2)$. More recently,
Fox and Sudakov [39] proved an upper bound on $c(\Delta)$ which is $O(\Delta \log \Delta)$. These results giving a
polynomial bound on the induced Ramsey number of graphs of bounded degree do not appear to
extend to more than two colors.

A graph $G$ is induced Ramsey $(\Delta, n, r)$-universal if, for every $r$-edge-coloring of $G$, there is a
color for which there is a monochromatic induced copy in that color of every graph on $n$ vertices
with maximum degree $\Delta$. Clearly, if $G$ is induced Ramsey $(\Delta, n, r)$-universal, then $r_{\mathrm{ind}}(H; r) \le |G|$
for every graph $H$ on $n$ vertices with maximum degree $\Delta$.

Theorem 9.4. For each $\Delta$ and $r$ there is $C = C(\Delta, r)$ such that for every $n$ there is an induced
Ramsey $(\Delta, n, r)$-universal graph on at most $Cn^{2\Delta+8}$ vertices.

The exponent of $n$ in the above result is best possible up to a multiplicative factor. This is
because even for the much weaker condition that $G$ contains an induced copy of all graphs on $n$
vertices with maximum degree $\Delta$, the number of vertices of $G$ has to be $\Omega(n^{\Delta/2})$ (see, e.g., [14]).

We have the following immediate corollary of Theorem 9.4, improving the bound for induced
Ramsey numbers of bounded degree graphs. It is also the first polynomial upper bound which
works for more than two colors.

Corollary 9.5. For each $\Delta$ and $r$ there is $C = C(\Delta, r)$ such that $r_{\mathrm{ind}}(H; r) \le Cn^{2\Delta+8}$ for every
$n$-vertex graph $H$ of maximum degree $\Delta$.

We next sketch the proof of Theorem 9.4. The proof builds on ideas used in the proof of
Chvátal, Rödl, Szemerédi, and Trotter [20] that Ramsey numbers of bounded degree graphs grow
linearly in the number of vertices. We claim that any graph $G$ on $N = Cn^{2\Delta+8}$ vertices which
is $(p, \beta)$-jumbled with $p = \frac{1}{n}$ and $\beta = O(\sqrt{pN})$ is the desired induced Ramsey $(\Delta, n, r)$-universal
graph. Such a graph exists as almost surely $G(N, p)$ has this jumbledness property. Note that
$\beta = cp^{\Delta+\frac{5}{2}}N$ with $c = O(p^2)$. We consider an $r$-coloring of the edges of $G$ and apply the
multicolor sparse regularity lemma so that each color satisfies a discrepancy condition between
almost all pairs of parts. Using Turán's theorem and Ramsey's theorem in the reduced graph,
we find A + 1 parts Ai, . . . , Aa+i, each pair of which has density at least ^ in the same color, 
say red, and satisfies a discrepancy condition. Let $H$ be a graph on $n$ vertices with maximum
degree $\Delta$, so $H$ has chromatic number at most $\Delta + 1$. Assign each vertex $a$ of $H$ to some part so
that the vertices assigned to each part form an independent set in $H$. We then use the induced
counting lemma, Theorem 9.2, to get an induced monochromatic red copy of $H$. We make a couple
of observations which are vital for this proof to work, and one must look closely into the proof
of the induced counting lemma to verify these claims. First, we can choose the constants in the
regularity lemma and the counting lemma so that they only depend on the maximum degree $\Delta$ and
the number of colors $r$ and not on the number $n$ of vertices. Indeed, in addition to the at most $2\Delta$
times that we apply inheritance of regularity, the discrepancy parameter increases by a factor of
at most $(1 - p - \sqrt{c})^{-2n} = (1 - \frac{1}{n} - O(\frac{1}{n}))^{-2n} = (1 - O(\frac{1}{n}))^{-2n} = O(1)$ due to the restrictions imposed
by the non-edges of $H$. So we lose a total of at most a constant factor in the discrepancy, which
does not affect the outcome. Second, as we assigned some vertices of $H$ to the same part, they may
get embedded to the same vertex. However, one easily checks that almost all the embeddings of $H$
in the proof of the induced counting lemma are one-to-one, and hence there is a monochromatic
induced copy of $H$. Indeed, as there are fewer than $n$ vertices which are previously embedded at
each step of the proof of the induced counting lemma, and $|W(j)| \gg n$, there is always a vertex
$w \in W(j)$ to pick for $f(v_j)$ to continue the embedding. This completes the proof sketch.
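
The parameter bookkeeping in this sketch can be made explicit. The calculation below is a sketch under stated assumptions: $N = Cn^{2\Delta+8}$, $p = 1/n$, and $\beta = C'\sqrt{pN}$, where $C'$ is a constant introduced here purely for illustration. Solving $\beta = cp^{\Delta+5/2}N$ for $c$ shows why the exponent $2\Delta + 8$ makes $\sqrt{c}$ of order $1/n$, which is what keeps the accumulated loss in the discrepancy parameter bounded.

```latex
\begin{align*}
c &= \frac{C'\sqrt{pN}}{p^{\Delta+\frac{5}{2}}N}
   = C'\,p^{-\Delta-2}\,N^{-\frac{1}{2}}
   = C'\,n^{\Delta+2}\cdot C^{-\frac{1}{2}}\,n^{-\Delta-4}
   = C'\,C^{-\frac{1}{2}}\,n^{-2}
   = O(p^{2}), \\[4pt]
\sqrt{c} &= O\!\left(\tfrac{1}{n}\right)
\quad\Longrightarrow\quad
\left(1 - p - \sqrt{c}\right)^{-2n}
   = \left(1 - O\!\left(\tfrac{1}{n}\right)\right)^{-2n}
   \le e^{O(1)} = O(1).
\end{align*}
```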

In the proof sketched above, the use of the sparse regularity lemma forces an enormous upper
bound on $C(\Delta, r)$, of tower-type. However, all we needed was $\Delta + 1$ parts such that the graph
between each pair of parts has density at least ^ in the same color and satisfies a discrepancy 
condition. To guarantee this, one does not need the full strength of the regularity lemma, and
the sparse version of the Duke-Lefmann-Rödl weak regularity lemma discussed in Subsection 9.4 is
sufficient. This gives a better bound on $C(\Delta, r)$, which is an exponential tower of constant height.

The last application we mention is an induced extension of the sparse Erdős-Stone-Simonovits
theorem, Theorem 1.4. We say that a graph $\Gamma$ is induced $(H, \epsilon)$-Turán if any subgraph of $\Gamma$ with
at least $\left(1 - \frac{1}{\chi(H)-1} + \epsilon\right)e(\Gamma)$ edges contains a strongly induced copy of $H$.

Theorem 9.6. For every graph $H$ and every $\epsilon > 0$, there exists $c > 0$ such that if $\beta \le cp^{d(H)+\frac{5}{2}}n$
then any $(p, \beta)$-jumbled graph on $n$ vertices with $p \le \frac{1}{v(H)}$ is induced $(H, \epsilon)$-Turán.

The proof of Theorem 9.6 is the same as the proof of Theorem 1.4, except we replace the
one-sided counting lemma, Theorem 1.14, with its induced variant, Theorem 9.2.

9.4 Other sparse regularity lemmas 

The sparse regularity lemma, in the form due to Scott [80], states that for every $\epsilon > 0$ and positive
integer $m$, there exists a positive integer $M$ such that every graph $G$ has an equitable partition into
$k$ pieces $V_1, V_2, \dots, V_k$ with $m \le k \le M$ such that all but $\epsilon k^2$ pairs $(V_i, V_j)_G$ satisfy $\mathrm{DISC}(p_{ij}, p_{ij}, \epsilon)$
for some $p_{ij}$. The additional condition of jumbledness which we imposed in our regularity lemma,
Theorem 1.11, was there so as to force all of the $p_{ij}$ to be $p$. If this were not the case, it could easily
be that all of the edges of the graph bunch up within a particular bad pair, so the result would tell
us nothing.

In our results, we made repeated use of sparse regularity. While convenient, this does have its
limitations. In particular, the bound which the regularity lemma gives on the number of pieces
$M$ in the regular partition is (and is necessarily [23, 46]) of tower-type in $\epsilon$. This means that the
constants $c^{-1}$ which this method produces for Theorems 1.1, 1.4, 1.6 and 1.5 are also of tower-type.

In the dense setting, there are other sparse regularity lemmas which prove sufficient for many of 
our applications. One such example is the cylinder regularity lemma of Duke, Lefmann and Rodl 
[29]. This lemma says that for a $k$-partite graph, between sets $V_1, V_2, \ldots, V_k$, there is an $\epsilon$-regular 
partition of the cylinder $V_1 \times \cdots \times V_k$ into a relatively small number of cylinders 
$K = W_1 \times \cdots \times W_k$, with $W_i \subseteq V_i$ for $1 \le i \le k$. The definition of an $\epsilon$-regular partition here is 
that all but an $\epsilon$-fraction of the $k$-tuples $(v_1, \ldots, v_k) \in V_1 \times \cdots \times V_k$ are in $\epsilon$-regular cylinders, 
where a cylinder $W_1 \times \cdots \times W_k$ is $\epsilon$-regular if all $\binom{k}{2}$ pairs $(W_i, W_j)$, $1 \le i < j \le k$, are 
$\epsilon$-regular in the usual sense. 

For sparse graphs, a similar theorem may be proven by appropriately adapting the proof of Duke, 
Lefmann and Rodl using the ideas of Scott. Consider a $k$-partite graph, between sets $V_1, V_2, \ldots, V_k$. 
We will say that a cylinder $K = W_1 \times \cdots \times W_k$, with $W_i \subseteq V_i$ for $1 \le i \le k$, satisfies 
$\mathrm{DISC}(q_K, p_K, \epsilon)$ with $q_K = (q_{ij})_{1 \le i < j \le k}$ and $p_K = (p_{ij})_{1 \le i < j \le k}$ if all $\binom{k}{2}$ pairs 
$(W_i, W_j)$, $1 \le i < j \le k$, satisfy $\mathrm{DISC}(q_{ij}, p_{ij}, \epsilon)$. The sparse version of the cylinder regularity 
lemma is now as follows. 

Proposition 9.7. For every $\epsilon > 0$ and positive integer $k$, there exists $\gamma > 0$ such that if $G = (V, E)$ 
is a $k$-partite graph with $k$-partition $V = V_1 \cup \cdots \cup V_k$ then there exists an $\epsilon$-regular partition 
$\mathcal{K}$ of $V_1 \times \cdots \times V_k$ into at most $\gamma^{-1}$ cylinders such that, for each $K \in \mathcal{K}$ with 
$K = W_1 \times \cdots \times W_k$ and $1 \le i \le k$, $|W_i| \ge \gamma |V_i|$. 

The constant $\gamma$ is at most exponential in a power of $k\epsilon^{-1}$. Moreover, this theorem is sufficient 
for our applications to Turan's theorem and Ramsey's theorem. This results in constants which 
are at most double exponential in the parameters $|H|$, $\epsilon$ and $r$ for Theorems 1.4 and 1.6. 

For the graph removal lemma, we may also make some improvement, but it is of a less dramatic 
nature. As in the dense case [37], it shows that in Theorem 1.1 we may take $\delta^{-1}$ and $c^{-1}$ to be a 
tower of twos of height logarithmic in $\epsilon^{-1}$. The proof essentially transfers to the sparse case using 
the sparse counting lemma, Theorem 1.14. 



9.5 Algorithmic applications 

The algorithmic versions of Szemeredi's regularity lemma and its variants have applications to 
fundamental algorithmic problems such as max-cut, max $k$-sat, and property testing (see [9] and its 
references). The result of Alon and Naor [8] approximating the cut-norm of a graph via Grothendieck's 
inequality allows one to obtain algorithmic versions of Szemeredi's regularity lemma [3], the 
Frieze-Kannan weak regularity lemma [21], and the Duke-Lefmann-Rodl weak regularity lemma. Many of 
these algorithmic applications can be transferred to the sparse setting using algorithmic versions 
of the sparse regularity lemmas, allowing one to substantially improve the error approximation in 
this setting. Our new counting lemmas allow for further sparse extensions. We describe one such 
extension below. 

Suppose we are given a graph $H$ on $h$ vertices, and we want to compute the number of copies of 
$H$ in a graph $G$ on $n$ vertices. The brute force approach considers all possible $h$-tuples of vertices 
and computes the desired number in time $O(n^h)$. The Duke-Lefmann-Rodl regularity lemma was 
originally introduced in order to obtain a much faster algorithm, which runs in polynomial time with 
an absolute constant exponent, at the expense of some error. More precisely, for each $\epsilon > 0$, they 
found an algorithm which, given a graph on $n$ vertices, runs in polynomial time and approximates 
the number of copies of $H$ as a subgraph to within $\epsilon n^h$. The running time is of the form $C(h, \epsilon) n^c$, 
where $c$ is an absolute constant and $C(h, \epsilon)$ is exponential in a power of $h\epsilon^{-1}$. We have the following 
extension of this result to the sparse setting. The proof transfers from the dense setting using the 
algorithmic version of the sparse Duke-Lefmann-Rodl regularity lemma, Proposition 9.7, and the 
sparse counting lemma, Theorem 1.12. For a graph $H$, we let $s(H) = \min\left\{ \frac{\Delta(L(H))+4}{2}, \frac{d(L(H))+6}{2} \right\}$. 

Proposition 9.8. Let $H$ be a graph on $h$ vertices with $s(H) \le k$ and $\epsilon > 0$. There is an absolute 
constant $c$ and another constant $C = C(\epsilon, h)$ depending only exponentially on $h\epsilon^{-1}$ such that the 
following holds. Given a graph $G$ on $n$ vertices which is known to be a spanning subgraph of a 
$(p, \beta)$-pseudorandom graph with $\beta < C^{-1} p^k n$, the number of copies of $H$ in $G$ can be computed up 
to an error $\epsilon p^{e(H)} n^{v(H)}$ in running time $C n^c$. 
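
For comparison with Proposition 9.8, the brute-force baseline mentioned earlier in this subsection can be sketched in a few lines of Python. This is only an illustration of the $O(n^h)$ enumeration over ordered $h$-tuples, not the Duke-Lefmann-Rodl algorithm or its sparse analogue; the function name `count_labeled_copies` and the adjacency-set representation are assumptions made for the example. It counts labeled copies, that is, injective maps from $V(H)$ to $V(G)$ that send edges to edges.

```python
from itertools import permutations


def count_labeled_copies(h_edges, h, g_adj):
    """Count injective maps V(H) -> V(G) that send every edge of H to an edge of G.

    h_edges: list of pairs (u, v) with 0 <= u, v < h (hypothetical input format)
    g_adj:   list of neighbor sets, one per vertex of G
    """
    n = len(g_adj)
    count = 0
    # Enumerate all ordered h-tuples of distinct vertices of G: about n^h candidates.
    for image in permutations(range(n), h):
        if all(image[v] in g_adj[image[u]] for u, v in h_edges):
            count += 1
    return count


# Example: labeled triangles in the complete graph K_4.
k4 = [set(range(4)) - {v} for v in range(4)]
triangle = [(0, 1), (1, 2), (0, 2)]
print(count_labeled_copies(triangle, 3, k4))
```

Every ordered triple of distinct vertices of $K_4$ spans a triangle, so the example counts all $4 \cdot 3 \cdot 2$ such triples.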

9.6 Multiplicity results 

There are many problems and results in graph Ramsey theory and extremal graph theory on the 
multiplicity of subgraphs. These results can be naturally extended to sparse pseudorandom graphs 
using the tools developed in this paper. Indeed, by applying the sparse regularity lemma and the 
new counting lemmas, we get extensions of these results to sparse graphs. In this subsection, we 
discuss a few of these results. 

Recall that Ramsey's theorem states that every 2-edge-coloring of a sufficiently large complete 
graph $K_n$ contains at least one monochromatic copy of a given graph $H$. Let $c_{H,n}$ denote the 
fraction of copies of $H$ in $K_n$ that must be monochromatic in any 2-edge-coloring of $K_n$. By an 
averaging argument, $c_{H,n}$ is a bounded, monotone increasing function in $n$, and therefore has a 
positive limit $c_H$ as $n \to \infty$. The constant $c_H$ is known as the Ramsey multiplicity constant for the 
graph $H$. It is simple to show that $c_H \le 2^{1-m}$ for a graph $H$ with $m = e(H)$ edges, where this 
bound comes from considering a random 2-edge-coloring of $K_n$ with each coloring equally likely. 
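
The random-coloring bound can be checked by exhaustive search in the smallest interesting case, $H = K_3$ with $m = 3$. The following Python sketch (the function name `triangle_multiplicity` is an illustrative choice, not from the source) enumerates every 2-edge-coloring of $K_n$ and returns both the minimum fraction of monochromatic triangles, which is $c_{K_3,n}$, and the average fraction over all colorings, which equals $2^{1-m} = 1/4$. It is feasible only for very small $n$, since there are $2^{\binom{n}{2}}$ colorings.

```python
from fractions import Fraction
from itertools import combinations, product


def triangle_multiplicity(n):
    """Return (min, average) fraction of monochromatic triangles
    over all 2-edge-colorings of the complete graph K_n."""
    edges = list(combinations(range(n), 2))
    index = {e: i for i, e in enumerate(edges)}
    # Each triangle is stored as the indices of its three edges.
    triangles = [(index[(a, b)], index[(a, c)], index[(b, c)])
                 for a, b, c in combinations(range(n), 3)]
    best = len(triangles)
    total = 0
    colorings = 0
    for coloring in product((0, 1), repeat=len(edges)):
        mono = sum(1 for i, j, k in triangles
                   if coloring[i] == coloring[j] == coloring[k])
        best = min(best, mono)
        total += mono
        colorings += 1
    return (Fraction(best, len(triangles)),
            Fraction(total, colorings * len(triangles)))


print(triangle_multiplicity(5))
```

For $n = 5$ the minimum is 0, since $K_5$ has a 2-edge-coloring with no monochromatic triangle, while the average is exactly $1/4$. The minimum can never exceed the average, which is the bound $c_{H,n} \le 2^{1-m}$ in miniature.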

Erdos [H] and, in a more general form, Burr and Rosta [13] suggested that the Ramsey multiplicity 
constant is achieved by a random coloring. These conjectures are false, as was demonstrated 
by Thomason [90], even for $H$ being any complete graph $K_t$ with $t \ge 4$. Moreover, as shown in 
[36], there are graphs $H$ with $m$ edges and $c_H \le m^{-m/2+o(m)}$, which demonstrates that the random 
coloring is far from being optimal for some graphs. 

For bipartite graphs the situation seems to be very different. The edge density of a graph is the 
fraction of pairs of vertices that are edges. The conjectures of Erdos-Simonovits [83] and Sidorenko 
[81] suggest that for any bipartite $H$ the number of copies of $H$ in any graph $G$ on $n$ vertices with 
edge density $p$ bounded away from 0 is asymptotically at least the same as in the $n$-vertex random 
graph with edge density $p$. This conjecture implies that $c_H = 2^{1-m}$ if $H$ is bipartite with $m$ edges. 
The most general results on this problem were obtained in [21] and [65], where it is shown that 
every bipartite graph $H$ which has a vertex in one part complete to the other part satisfies the 
conjecture. 
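
The simplest case of the conjecture, the path with two edges, follows from the Cauchy-Schwarz inequality and is easy to check numerically. The Python sketch below is an illustration under an assumption not made in the text: it uses homomorphism densities (the number of edge-preserving vertex maps divided by $n^{v(H)}$) rather than the fraction-of-pairs edge density defined above; the two normalizations agree asymptotically. The function name `hom_densities` is hypothetical.

```python
from fractions import Fraction


def hom_densities(adj):
    """Return (t_edge, t_path): homomorphism densities of a single edge
    and of the path with two edges in the graph given by neighbor sets."""
    n = len(adj)
    degrees = [len(neighbors) for neighbors in adj]
    # Each edge uv yields two homomorphisms of K_2, so the count is the degree sum.
    t_edge = Fraction(sum(degrees), n ** 2)
    # A homomorphism of the 2-edge path picks a middle vertex and two
    # (possibly equal) neighbors of it, so the count is the sum of squared degrees.
    t_path = Fraction(sum(d * d for d in degrees), n ** 3)
    return t_edge, t_path


# Star with center 0 and three leaves: the degrees are unequal.
star = [{1, 2, 3}, {0}, {0}, {0}]
t_edge, t_path = hom_densities(star)
print(t_edge ** 2, t_path, t_path >= t_edge ** 2)
```

The comparison `t_path >= t_edge ** 2` is the statement that the path density is at least the value expected in a random graph of the same edge density; equality holds exactly when the graph is regular.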

More generally, let $c_{H,\Gamma}$ denote the fraction of copies of $H$ in $\Gamma$ that must be monochromatic in 
any 2-edge-coloring of $\Gamma$. For a graph $\Gamma$ with $n$ vertices, by averaging over all copies of $\Gamma$ in $K_n$, 
we have $c_{H,\Gamma} \le c_{H,n} \le c_H$. It is natural to try to find conditions on $\Gamma$ which imply that $c_{H,\Gamma}$ is 
close to $c_H$. The next theorem shows that if $\Gamma$ is sufficiently jumbled, then $c_{H,\Gamma}$ is close to $c_H$. The 
proof follows by noting that the proportion of monochromatic copies of $H$ in the weighted reduced 
graph $R$ is at least $c_H$. This count then transfers back to $\Gamma$ using the one-sided counting lemma. 
We omit the details. 

Theorem 9.9. For each $\epsilon > 0$ and graph $H$, there is $c > 0$ such that if $\Gamma$ is a $(p, \beta)$-jumbled graph on 
$n$ vertices with $\beta < c p^{d_2(H)+3} n$ then every 2-edge-coloring of $\Gamma$ contains at least 
$(c_H - \epsilon) p^{e(H)} n^{v(H)}$ labeled monochromatic copies of $H$. 

Maybe the earliest result on Ramsey multiplicity is Goodman's theorem [35], which determines 
$c_{K_3,n}$ and, in particular, implies $c_{K_3} = \frac{1}{4}$. The next theorem shows an extension of Goodman's 
theorem to pseudorandom graphs, giving an optimal jumbledness condition to imply $c_{K_3,\Gamma} =$ 

Theorem 9.10. If $\Gamma$ is a $(p, \beta)$-jumbled graph on $n$ vertices with $\beta < \frac{1}{10} p^2 n$, then every 
2-edge-coloring of $\Gamma$ contains at least $\left(p^3 - 10 p \frac{\beta}{n}\right) \frac{n^3}{24}$ monochromatic triangles. 

The proof of this theorem follows by first noting that $T = A + 2M$, where $A$ denotes the number 
of triangles in $\Gamma$, $M$ the number of monochromatic triangles in $\Gamma$, and $T$ the number of ordered 
triples $(a, b, c)$ of vertices of $\Gamma$ which form a triangle such that $(a, b)$ and $(a, c)$ are the same color. 
We then give an upper bound for $A$ and a lower bound for $T$ using the jumbledness conditions and 
standard inequalities. We omit the precise details. 
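
The identity $T = A + 2M$ can be sanity-checked numerically. The Python sketch below assumes one particular reading of the definitions, under which the identity balances exactly: $A$ and $M$ count unordered triangles, and $T$ counts each triangle once for every choice of apex $a$ whose two triangle edges share a color, so the pair $\{b, c\}$ is treated as unordered. The random graph, the random coloring and the name `goodman_counts` are illustrative assumptions, not part of the source.

```python
import random
from itertools import combinations


def goodman_counts(n, edge_prob, seed):
    """Return (A, M, T) for a random graph on n vertices with a random 2-edge-coloring."""
    rng = random.Random(seed)
    color = {}  # maps each edge (u, v) with u < v to its color, 0 or 1
    for u, v in combinations(range(n), 2):
        if rng.random() < edge_prob:
            color[(u, v)] = rng.randrange(2)

    def col(u, v):
        # Color of the edge uv, or None if uv is not an edge.
        return color.get((min(u, v), max(u, v)))

    A = M = T = 0
    for a, b, c in combinations(range(n), 3):
        cols = (col(a, b), col(a, c), col(b, c))
        if None in cols:
            continue  # the three vertices do not span a triangle
        A += 1
        if cols[0] == cols[1] == cols[2]:
            M += 1
        # Each vertex of the triangle acts as the apex once.
        T += (cols[0] == cols[1])  # apex a: edges ab and ac
        T += (cols[0] == cols[2])  # apex b: edges ab and bc
        T += (cols[1] == cols[2])  # apex c: edges ac and bc
    return A, M, T


A, M, T = goodman_counts(12, 0.6, seed=1)
print(A, M, T, T == A + 2 * M)
```

A monochromatic triangle contributes 3 to $T$ and any other triangle contributes exactly 1, which gives $T = 3M + (A - M) = A + 2M$.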

The previous theorem has the following immediate corollary, giving an optimal jumbledness 
condition to imply that a graph is $(K_3, 2)$-Ramsey. 

Corollary 9.11. If $\Gamma$ is a $(p, \beta)$-jumbled graph on $n$ vertices with $\beta < \frac{1}{10} p^2 n$, then $\Gamma$ is 
$(K_3, 2)$-Ramsey. 

Define the Turan multiplicity $\rho_{H,d,n}$ to be the minimum, over all graphs $G$ on $n$ vertices with 
edge density at least $d$, of the fraction of copies of $H$ in $K_n$ which are also in $G$. Let $\rho_{H,d}$ be the 
limit of $\rho_{H,d,n}$ as $n \to \infty$. This limit exists by an averaging argument. The conjectures of 
Erdos-Simonovits [83] and Sidorenko [81] mentioned earlier can be stated as $\rho_{H,d} = d^{e(H)}$ for bipartite 
$H$. Recently, Reiher [73], extending work of Razborov [72] and Nikiforov [70] for $t = 3$ and 4, 
determined $\rho_{K_t,d}$ for all $t \ge 3$. 

We can similarly extend these results to the sparse setting. Let $\rho_{H,d,\Gamma}$ be the minimum, over 
all subgraphs $G$ of $\Gamma$ with at least $d\,e(\Gamma)$ edges, of the fraction of copies of $H$ in $\Gamma$ which are also in 
$G$. We have the following result, which gives jumbledness conditions on $\Gamma$ which imply that $\rho_{H,d,\Gamma}$ 
is close to $\rho_{H,d}$. 

Theorem 9.12. For each $\epsilon > 0$ and graph $H$ there is $c > 0$ such that if $\Gamma$ is a $(p, \beta)$-jumbled graph 
on $n$ vertices with $\beta < c p^{d_2(H)+3} n$ then every subgraph of $\Gamma$ with at least $d\,e(\Gamma)$ edges contains at 
least $(\rho_{H,d} - \epsilon) p^{e(H)} n^{v(H)}$ labeled copies of $H$. 